Abstract: We show how generalized energy correlation functions can be used as a powerful
probe of jet substructure. These correlation functions are based on the energies and pairwise
angles of particles within a jet, with $(N+1)$-point correlators sensitive to $N$-prong
substructure. Unlike many previous jet substructure methods, these correlation functions do
not require the explicit identification of subjet regions. In addition, the correlation functions
are better probes of certain soft and collinear features that are masked by other methods. We
present three Monte Carlo case studies to illustrate the utility of these observables: 2-point
correlators for quark/gluon discrimination, 3-point correlators for boosted $W$/$Z$/Higgs boson
identification, and 4-point correlators for boosted top quark identification. For quark/gluon
discrimination, the 2-point correlator is particularly powerful, as can be understood via a
next-to-leading logarithmic calculation. For boosted 2-prong resonances the benefit depends
on the mass of the resonance.
1 Introduction

The field of jet substructure has evolved significantly over the last few years [1, 2]. Many 
procedures have been developed not only for identifying and classifying jets [3-9] but also for 
removing jet contamination due to underlying event or pile-up [5, 10-14]. On the theoretical 
side, there has been substantial progress in computing and understanding these observables 
and procedures in perturbative QCD [15-23]. On the experimental side, the ATLAS and 
CMS experiments at the Large Hadron Collider (LHC) have begun measuring and testing
jet substructure ideas [24-39], with pile-up suppression becoming increasingly important at
higher luminosities. With the recent discovery of a Higgs-like particle [40, 41], jet substructure
methods for identifying the $H \to b\bar{b}$ decay mode [5] (and potentially the $H \to gg$ decay mode)
could be vital for testing Higgs properties.

A common strategy for jet substructure studies is to first identify subjets, namely,
localized subclusters of energy within a jet. Jet discrimination then involves studying the
properties of and relationship between the subjets. For example, BDRS [5] and related methods
[8, 42, 43] involve first reclustering the jet with the Cambridge/Aachen [44-46] or $k_T$ [47, 48]
jet algorithm and then stepping through the clustering history to identify a hard splitting in
the jet; pruning [12] is similar. $N$-subjettiness [49, 50] relies on a (quasi-)minimization
procedure to identify $N$ subjet directions in the jet. Of course, there are jet shapes such as jet
angularities [9, 51], planar flow [7, 9], Zernike coefficients [52], and Fox-Wolfram moments [53]
that can be used for classifying jets without subjet finding. Considered individually, however,
these jet shapes tend not to yield the same discrimination power as subjet methods, since
they are sensitive mainly to exotic kinematic configurations and not directly to prong-like
substructure.

In this paper, we introduce generalized energy correlation functions that can identify
$N$-prong jet substructure without requiring a subjet finding procedure. These correlators
only use information about the energies and pairwise angles of particles within a jet, but
yield discrimination power comparable to methods based on subjets. As we will see,
$(N+1)$-point correlation functions are sensitive to $N$-prong substructure, with an angular exponent
$\beta$ that can be adjusted to optimize the discrimination power. To our knowledge, the 2-point
correlators (schematically $\sum E_i E_j \theta_{ij}^{\beta}$, where the sum runs over all particles $i$ and $j$ in a jet
or event) first appeared in Ref. [54] and independently in Ref. [55], with no previous studies
of higher-point correlators.¹

Besides the novelty of not requiring subjet finding, a key feature of the generalized energy
correlation functions is that the angular exponent $\beta$ can be set to any value consistent with
infrared and collinear safety, namely $\beta > 0$. In contrast, observables like angularities [9, 51]
are required to have $\beta \geq 1$ to avoid being dominated by recoil effects.² By choosing values
of $\beta \simeq 0.2$, the correlators are able to more effectively probe small-scale collinear splittings,
which will turn out to be useful for quark/gluon discrimination.

To put our work in perspective, it is worth remembering that the basic idea for using
energy correlation functions to determine the number of jets in an event is actually quite old.
As we will review, the $C$-parameter for $e^+e^-$ collisions [61, 62] is essentially a 3-point energy
correlation function that can be used to identify events that have two jets. However, the
$C$-parameter is defined as a function of the eigenvalues of the sphericity tensor and therefore
only gives sensible values for systems that have zero total momentum and for events that are
nearly dijet-like. In contrast, our generalized energy correlation functions give sensible results
in any Lorentz frame and can be used to identify any number of jets in an event (or subjets
within a jet). In addition, they can be defined in any number of spacetime dimensions.

¹ Our definition of the energy correlation function should not be confused with Refs. [56-59], which refer to
an ensemble average of products of energies measured at fixed angles. Here, energy correlation functions are
measured on an event-by-event basis.

² As will be discussed in Sec. 2.2 and a forthcoming publication [60], $N$-subjettiness may or may not have
recoil sensitivity depending on how the axes are chosen.

The remainder of the paper is organized as follows. In Sec. 2, we introduce arbitrary-point
energy correlation functions and define appropriate energy correlation double ratios
$C_N^{(\beta)}$ (built from the $(N+1)$-point correlator), which can be used to identify a system with
$N$ (sub)jets. We also contrast the behavior of $C_N^{(\beta)}$ with $N$-subjettiness ratios. We then
present three case studies to show how these generalized energy correlation functions work
for different types of jet discrimination.

• Quark/gluon discrimination. Using $C_1^{(\beta)}$ (built from the 2-point correlator) in Sec. 3,
we perform both an analytic study and a Monte Carlo study of quark/gluon separation.
Through a next-to-leading logarithmic study, we explain why quark/gluon discrimination
greatly improves as the angular exponent approaches zero (at least down to
$\beta \simeq 0.2$), highlighting the importance of working with recoil-free observables.

• Boosted $W$/$Z$/Higgs identification. Using $C_2^{(\beta)}$ (built from the 3-point correlator) in
Sec. 4, we will see that the discrimination power between QCD jets and jets with
two intrinsic subjets from a colour-singlet decay depends strongly on the ratio of the
jet mass to its transverse momentum. This occurs because a QCD jet obtains mass
in different ways depending on this ratio. In particular, we will see that the energy
correlation function performs better than $N$-subjettiness in situations where the jet
mass is dominated by soft wide-angle emissions.

• Boosted top quark identification. Using $C_3^{(\beta)}$ (built from the 4-point correlator) in Sec. 5,
we find comparable discrimination power to other top-tagging methods. While one
might worry that the 4-point correlators would face a high computational cost, we find
that a boosted top event can be analyzed for a single value of $\beta$ in a few milliseconds.

We conclude in Sec. 6 with an experimental and theoretical outlook. The energy correlation
functions are available as an add-on to FastJet 3 [63] as part of the FastJet contrib project
(http://fastjet.hepforge.org/contrib/).

2 Generalized Energy Correlation Functions 

The basis for our analysis is the $N$-point energy correlation function (ECF)

\[
  \mathrm{ECF}(N,\beta) = \sum_{i_1 < i_2 < \cdots < i_N \in J}
  \left( \prod_{a=1}^{N} E_{i_a} \right)
  \left( \prod_{b=1}^{N-1} \prod_{c=b+1}^{N} \theta_{i_b i_c} \right)^{\beta} . \tag{2.1}
\]

Here, the sum runs over all particles within the system $J$ (either a jet or the whole event).
Each term consists of $N$ energies multiplied together with $\binom{N}{2}$ pairwise angles raised to the
angular exponent $\beta$. This function is well-defined in any number of space-time dimensions
as well as for systems that do not have zero total momentum. Note that it is infrared and
collinear (IRC) safe for all $\beta > 0$. Moreover, $\mathrm{ECF}(N,\beta)$ goes to zero in all possible soft and
collinear limits of $N$ partons.

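As written, Eq. (2.1) is most appropriate for $e^+e^-$ colliders where energies and angles
are the usual experimental observables. For hadron colliders, it is more natural to define
$\mathrm{ECF}(N,\beta)$ as a transverse momentum correlation function:³

\[
  \mathrm{ECF}(N,\beta) = \sum_{i_1 < i_2 < \cdots < i_N \in J}
  \left( \prod_{a=1}^{N} p_{T i_a} \right)
  \left( \prod_{b=1}^{N-1} \prod_{c=b+1}^{N} R_{i_b i_c} \right)^{\beta} , \tag{2.2}
\]

where $R_{ij}$ is the Euclidean distance between $i$ and $j$ in the rapidity-azimuth angle plane,
$R_{ij}^2 = (y_i - y_j)^2 + (\phi_i - \phi_j)^2$, with $y_i = \frac{1}{2} \ln \frac{E_i + p_{zi}}{E_i - p_{zi}}$. In this paper, we will only consider up to
4-point correlation functions:

\begin{align}
  \mathrm{ECF}(0,\beta) &= 1 , \tag{2.3} \\
  \mathrm{ECF}(1,\beta) &= \sum_{i \in J} p_{Ti} , \tag{2.4} \\
  \mathrm{ECF}(2,\beta) &= \sum_{i < j \in J} p_{Ti}\, p_{Tj}\, (R_{ij})^{\beta} , \tag{2.5} \\
  \mathrm{ECF}(3,\beta) &= \sum_{i < j < k \in J} p_{Ti}\, p_{Tj}\, p_{Tk}\, (R_{ij} R_{ik} R_{jk})^{\beta} , \tag{2.6} \\
  \mathrm{ECF}(4,\beta) &= \sum_{i < j < k < \ell \in J} p_{Ti}\, p_{Tj}\, p_{Tk}\, p_{T\ell}\,
      (R_{ij} R_{ik} R_{i\ell} R_{jk} R_{j\ell} R_{k\ell})^{\beta} . \tag{2.7}
\end{align}

If a jet has fewer than $N$ constituents then $\mathrm{ECF}(N,\beta) = 0$. Note that the computational
cost for $\mathrm{ECF}(N,\beta)$ with $k$ particles scales like $k^N / N!$.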
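
As an illustration of Eqs. (2.2)-(2.7), the following is a minimal brute-force sketch rather than the FastJet contrib implementation. The function names `delta_r` and `ecf`, the `(pt, y, phi)` tuple layout, and the three-constituent example jet are all assumptions made for this example. The loop over all $N$-particle subsets makes the $k^N/N!$ scaling explicit.

```python
from itertools import combinations
from math import hypot, pi


def delta_r(a, b):
    """Distance R_ij in the rapidity-azimuth plane for particles (pt, y, phi)."""
    dphi = abs(a[2] - b[2]) % (2.0 * pi)
    if dphi > pi:
        dphi = 2.0 * pi - dphi  # wrap the azimuthal difference into [0, pi]
    return hypot(a[1] - b[1], dphi)


def ecf(particles, n, beta):
    """Brute-force ECF(N, beta): loops over all N-particle subsets of the jet."""
    if n == 0:
        return 1.0
    total = 0.0
    for subset in combinations(particles, n):
        pt_product = 1.0
        for p in subset:
            pt_product *= p[0]
        angle_product = 1.0
        for a, b in combinations(subset, 2):
            angle_product *= delta_r(a, b)
        total += pt_product * angle_product ** beta
    return total


# Hypothetical three-constituent jet; each entry is (pt, y, phi).
jet = [(100.0, 0.0, 0.0), (50.0, 0.3, 0.0), (10.0, 0.0, 0.4)]
ecf_values = [ecf(jet, n, 1.0) for n in range(5)]
```

With only three constituents, the 4-point correlator has no subsets to sum over and therefore vanishes, as stated in the text.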

From the $\mathrm{ECF}(N,\beta)$, we would like to define a dimensionless observable that can be
used to determine if a system has $N$ subjets. The key observation is that the
$(N+1)$-point correlators go to zero if there are only $N$ particles. More generally, if a system has $N$
subjets, then $\mathrm{ECF}(N+1,\beta)$ should be significantly smaller than $\mathrm{ECF}(N,\beta)$. One potentially
interesting ratio is

\[
  r_N^{(\beta)} \equiv \frac{\mathrm{ECF}(N+1,\beta)}{\mathrm{ECF}(N,\beta)} , \tag{2.8}
\]

which behaves much like $N$-subjettiness $\tau_N$ in that for a system of $N$ partons plus soft
radiation, the observable is linear in the energy of the soft radiation.⁴ Of course, this is but
one choice for an interesting combination of the energy correlation functions, and one can
imagine using the whole set of energy correlation functions in a multivariate analysis.

³ We will continue to use the notation ECF, though we will mainly use the transverse momentum version
in this paper.

⁴ Unlike $N$-subjettiness, this ratio scales like $\gamma^{1 - N\beta}$ under transverse Lorentz boosts $\gamma$, which is somewhat
undesirable when considering systems with several subjets.
In this paper, we will work exclusively with the energy correlation double ratio

\[
  C_N^{(\beta)} \equiv \frac{r_N^{(\beta)}}{r_{N-1}^{(\beta)}}
  = \frac{\mathrm{ECF}(N+1,\beta)\, \mathrm{ECF}(N-1,\beta)}{\mathrm{ECF}(N,\beta)^2} , \tag{2.9}
\]

which is dimensionless.⁵ One way to motivate this observable is that we already know that
$N$-subjettiness ratios $\tau_N / \tau_{N-1}$ are good probes of $N$-prong substructure [49, 50]. As we will
see, the notation "$C$" is motivated by the fact that this variable generalizes the $C$-parameter
[61, 62]. One should keep in mind that $C_N^{(\beta)}$ involves $(N+1)$-point correlators, and when
clear from context, we will drop the $(\beta)$ superscript.

The energy correlation double ratio $C_N$ effectively measures higher-order radiation from
leading order (LO) substructure. For a system with $N$ subjets, the LO substructure consists
of $N$ hard prongs, so if $C_N$ is small, then the higher-order radiation must be soft or collinear
with respect to the LO structure. If $C_N$ is large, then the higher-order radiation is not
strongly-ordered with respect to the LO structure, so the system has more than $N$ subjets.
Thus, if $C_N$ is small and $C_{N-1}$ is large, then we can say that a system has $N$ subjets. In
this way, the energy correlation double ratio $C_N$ behaves like $N$-subjettiness ratios $\tau_N / \tau_{N-1}$,
with key advantages to be discussed in Sec. 2.2.
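
The logic "small $C_N$ together with large $C_{N-1}$ signals $N$ subjets" can be sketched numerically. The snippet below is an illustrative sketch only: the helper names, the `(pt, y, phi)` tuple layout, the toy two-prong jet, and the choice to return NaN when $\mathrm{ECF}(N,\beta)$ vanishes are assumptions of this example, and azimuthal wrap-around is ignored for brevity.

```python
from itertools import combinations
from math import hypot, prod


def ecf(particles, n, beta):
    """ECF(N, beta) for particles given as (pt, y, phi); no azimuthal wrap-around."""
    return sum(
        prod(p[0] for p in subset)
        * prod(hypot(a[1] - b[1], a[2] - b[2]) for a, b in combinations(subset, 2)) ** beta
        for subset in combinations(particles, n)
    )


def double_ratio(particles, n, beta):
    """C_N = ECF(N+1) ECF(N-1) / ECF(N)^2; NaN if ECF(N) vanishes (our convention)."""
    denominator = ecf(particles, n, beta) ** 2
    if denominator == 0:
        return float("nan")
    return ecf(particles, n + 1, beta) * ecf(particles, n - 1, beta) / denominator


# Toy jet: two equally hard prongs separated by R = 0.4 plus one soft particle.
two_prong = [(100.0, 0.0, 0.0), (100.0, 0.4, 0.0), (1.0, 0.2, 0.1)]
c1 = double_ratio(two_prong, 1, 1.0)  # roughly 0.10: not 1-prong-like
c2 = double_ratio(two_prong, 2, 1.0)  # roughly 0.0025: consistent with 2 prongs
```

For this configuration $C_1$ is about forty times larger than $C_2$, which is the pattern expected for a system with two subjets.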

2.1 Relationship to Previous Observables 

While the definition of the energy correlation double ratio $C_N$ is new, it is related to previous
observables for $e^+e^-$ and hadron colliders that have been studied in great detail.

An energy-energy correlation (EEC) function for $e^+e^-$ events was introduced in Ref. [54]
for its particularly nice factorization and resummation properties. It is defined as

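\[
  \mathrm{EEC}_a = \frac{1}{E_J^2} \sum_{i \neq j} E_i E_j\, |\sin\theta_{ij}|^{a}\,
  (1 - |\cos\theta_{ij}|)^{1-a}\, \Theta\!\left[ (\vec{q}_i \cdot \vec{n}_T)(\vec{q}_j \cdot \vec{n}_T) \right] , \tag{2.10}
\]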

where the sum runs over all particles in the event and $\vec{n}_T$ is the direction of the thrust
axis. This variable is IRC safe for all $a < 2$. The $\Theta$-function is only non-zero if the pair
of particles is in the same hemisphere. This removes the large correlation of the two initial
hard partons which would otherwise dominate the sum, and means that $\mathrm{EEC}_a$ behaves much
like the jet angularities [9, 51] with the same angular exponent $a$. The EEC was introduced
because it is insensitive to recoil effects and has smooth behavior for all allowed values of
$a$. In particular, $\mathrm{EEC}_a$ has a smooth transition through $a = 1$, whereas angularities exhibit
non-smooth behavior and also are increasingly sensitive to recoil effects as the angular power
$a$ increases. If one considers only one hemisphere of a dijet event, then $\mathrm{EEC}_a$ is approximately
the same as $C_1^{(\beta)}$ in our notation with $\beta = 2 - a$. Both observables are sensitive to 1-prong
(sub)structure, and we will discuss the issue of recoil further in Sec. 2.2.

A related two-particle angular correlation function was introduced in Refs. [21, 55, 64]
for discrimination of jets initiated by QCD from jets from boosted heavy particle decays. The
angular correlation function is defined as

\[
  \mathcal{G}_\beta(R) = \sum_{i \neq j} p_{Ti}\, p_{Tj}\, R_{ij}^{\beta}\, \Theta[R - R_{ij}] , \tag{2.11}
\]

where the $\Theta$-function only allows pairs of particles separated by an angular scale of $R$ or less
to contribute to the observable. The behavior of the observable can be studied as a function of
$R$, and jets that are approximately scale invariant should have an angular correlation function
that scales as a power of $R$. For a fixed value of $R$, the properties of the angular correlation
function are very similar to those of $\mathrm{EEC}_a$ and $C_1^{(\beta)}$.

⁵ This double ratio scales as $\gamma^{-\beta}$ under transverse Lorentz boosts.

As mentioned above, the notation $C_N^{(\beta)}$ was chosen because of its relation to the
$C$-parameter from $e^+e^-$ collisions [61, 62]. The $C$-parameter is used to identify two-jet
configurations without recourse to a jet algorithm or explicit jet axes choice. It is defined as

\[
  C = \frac{3}{2}\, \frac{\sum_{i,j} |\vec{p}_i|\, |\vec{p}_j| \sin^2\theta_{ij}}{\left( \sum_i |\vec{p}_i| \right)^2} , \tag{2.12}
\]

which can also be expressed in terms of the eigenvalues of the sphericity tensor. At first
glance, this looks very much like $C_1^{(2)}$ in the sense that the numerator looks like a 2-point
correlation function with $\beta = 2$. There is a crucial difference between the behavior of $\sin^2\theta_{ij}$
and $\theta_{ij}^2$, however, such that the $C$-parameter vanishes for dijet configurations when the jets
are back-to-back (i.e. $\theta_{ij} = \pi$). If we expand around the dijet limit, then the $C$-parameter
really behaves like a 3-point correlation function (i.e. like $C_2$). To see this, note that for
$e^+e^- \to q\bar{q}g$, the $C$-parameter has the simple form

\[
  C = 6\, \frac{(1 - x_1)(1 - x_2)(1 - x_3)}{x_1 x_2 x_3} , \tag{2.13}
\]

where $x_i = 2 p_i \cdot Q / Q^2$ and $Q$ is the total four-vector of the system.⁶ The angle between
final-state particles $i$ and $j$ in the $e^+e^- \to q\bar{q}g$ system is

\[
  1 - \cos\theta_{ij} = 2\, \frac{1 - x_k}{x_i x_j} . \tag{2.14}
\]

Thus, if we change $\theta \to 2 \sin\frac{\theta}{2}$ in the definition of $\mathrm{ECF}(N,\beta)$, then $C_2^{(2)}$ can be expressed as

\[
  C_2^{(2)} \propto \frac{(1 - x_1)(1 - x_2)(1 - x_3)}{x_1 x_2 x_3} , \tag{2.15}
\]

which, up to normalization, is the traditional $C$-parameter. Of course, at higher orders in
perturbation theory the definitions of the $C$-parameter and $C_2^{(2)}$ diverge. Both observables
are sensitive to 2-prong (sub)structure, though $C_2^{(\beta)}$ gives sensible answers even for systems
with non-zero total momentum and has an adjustable angular exponent $\beta$.

⁶ The $C$-parameter only properly makes sense if the total momentum of the system is zero, and so is not
immediately applicable for hadron collisions.
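
The three-parton relations in Eqs. (2.12)-(2.15) can be checked numerically. The sketch below is illustrative only: the function name `c_parameter_checks`, the normalization $Q = 1$, and the test point $(x_1, x_2) = (0.9, 0.8)$ are assumptions of this example. It evaluates the defining sum of Eq. (2.12) using the angles of Eq. (2.14), compares with the closed form of Eq. (2.13), and then builds $C_2^{(2)}$ with the replacement $\theta \to 2\sin\frac{\theta}{2}$, for which $\theta^2 \to 2(1 - \cos\theta)$.

```python
def c_parameter_checks(x1, x2):
    """Compare three evaluations for massless e+e- -> q qbar g with Q = 1."""
    x = [x1, x2, 2.0 - x1 - x2]        # energy fractions, x1 + x2 + x3 = 2
    energy = [xi / 2.0 for xi in x]    # E_i = x_i Q / 2, and |p_i| = E_i

    def one_minus_cos(i, j):
        k = 3 - i - j                  # index of the third parton
        return 2.0 * (1.0 - x[k]) / (x[i] * x[j])  # Eq. (2.14)

    pairs = [(0, 1), (0, 2), (1, 2)]

    # Eq. (2.12): the double sum over i, j counts each unordered pair twice.
    direct = 0.0
    for i, j in pairs:
        u = one_minus_cos(i, j)
        direct += 2.0 * energy[i] * energy[j] * u * (2.0 - u)  # sin^2 = u (2 - u)
    direct *= 1.5 / sum(energy) ** 2

    # Eq. (2.13): closed form for three massless partons.
    closed_form = 6.0 * (1 - x[0]) * (1 - x[1]) * (1 - x[2]) / (x[0] * x[1] * x[2])

    # C_2^(2) with theta -> 2 sin(theta / 2), i.e. theta^2 -> 2 (1 - cos theta).
    ecf1 = sum(energy)
    ecf2 = sum(energy[i] * energy[j] * 2.0 * one_minus_cos(i, j) for i, j in pairs)
    ecf3 = energy[0] * energy[1] * energy[2]
    for i, j in pairs:
        ecf3 *= 2.0 * one_minus_cos(i, j)
    c2 = ecf3 * ecf1 / ecf2 ** 2
    return direct, closed_form, c2


direct, closed_form, c2 = c_parameter_checks(0.9, 0.8)
```

In this normalization the proportionality constant in Eq. (2.15) works out to $4/3$, that is, $C_2^{(2)} = \frac{4}{3} C$ for the three-parton configuration.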
Higher-point energy correlation functions have been studied very little in the literature.
Two early studies for $e^+e^-$ collisions are in Refs. [62, 65]. However, both define observables
that only make sense for systems with total momentum equal to zero and explicitly use
operations only defined in three-dimensional space, such as cross-products and the properties of
momentum tensors with rank greater than 2. Thus, these observables cannot be easily
generalized to determine if a (boosted) system has $N$ (sub)jets. Historically, observables like the
$D$-parameter [61, 62, 66] have been used to identify peculiar phase space configurations such
as a planar configuration of particles. However, this is not directly related to the number of
jets in the event. Recent substructure variables like planar flow [7, 9], Zernike coefficients [52],
and Fox-Wolfram moments [53] are similarly sensitive to peculiar phase space configurations
rather than prong-like substructure. Planar flow, for example, vanishes if the constituents of
the jet lie along a line, which is a good probe for some (but not all) 3-prong configurations.
The energy correlation double ratio $C_N$ is designed to directly probe $N$-prong configurations,
though the high computational cost of $\mathrm{ECF}(N+1,\beta)$ likely limits the practical range to
$N \leq 3$ (i.e. up to three prongs).

2.2 Advantages Compared to $N$-subjettiness

The variable $N$-subjettiness [49, 50] (based on $N$-jettiness [67]) is a jet observable that can
be used to test whether a jet has $N$ subjets, and it has been used in a number of theoretical
[14, 20, 68-72] and experimental [27, 31] substructure studies. Since both $N$-subjettiness and
the energy correlation double ratio $C_N$ share the same motivation, it is worth highlighting
some of the advantages of the energy correlation double ratio.

First, a quick review of $N$-subjettiness. It is defined in terms of $N$ subjet axes $\hat{n}_A$ as⁷

\[
  \tau_N^{(\beta)} = \sum_i p_{Ti}\, \min\left\{ R_{1,i}^{\beta},\, R_{2,i}^{\beta},\, \ldots,\, R_{N,i}^{\beta} \right\} , \tag{2.16}
\]

where the sum runs over all particles in the jet and $R_{A,i}$ is the distance from axis $A$ to particle
$i$. There are a variety of methods to determine the subjet directions, with arguably the most
elegant way being to minimize $\tau_N$ over all possible subjet directions $\hat{n}_A$ [50]. If a jet has $N$
subjets, then $\tau_{N-1}$ should be much larger than $\tau_N$, so the observable that is typically used
for jet discrimination studies is the ratio

\[
  \tau_{N,N-1}^{(\beta)} \equiv \frac{\tau_N^{(\beta)}}{\tau_{N-1}^{(\beta)}} . \tag{2.17}
\]

As discussed above, this ratio is directly analogous to the energy correlation double ratio
$C_N^{(\beta)} = r_N^{(\beta)} / r_{N-1}^{(\beta)}$.

⁷ In Refs. [49, 50], $N$-subjettiness was defined with an overall normalization factor to make it dimensionless.
Here, we remove the normalization factor so it has the same dimensions as Eq. (2.8).

One immediate point of contrast between $N$-subjettiness and the energy correlation double
ratio is that $C_N$ does not require a separate procedure (such as minimization) to determine
the subjet directions. While novel, this by itself does not necessarily imply that $C_N$ will have
better discrimination power than $\tau_{N,N-1}$, though it does mean that $C_N$ is a simpler
variable to study.⁸ We now explain two test cases where $C_N$ can perform better than $\tau_{N,N-1}$:
insensitivity to recoil for $C_1$ and sensitivity to soft wide-angle emissions for $C_2$.

Figure 1: Example kinematics with soft wide-angle radiation. Left: recoil of the jet axis
(dashed) away from the hard jet core ($E_1$) due to soft wide-angle radiation ($E_2$), which
is relevant for small values of $\beta$. Right: a three-particle configuration that highlights the
difference between $C_2$ and $\tau_{2,1}$.

2.2.1 Insensitivity to Recoil 

A recoil-sensitive observable is one for which soft emissions have an indirect effect on the
observable. In addition to the direct contribution to the observable, soft radiation in a
recoil-sensitive observable changes the collinear contribution by an $\mathcal{O}(1)$ amount. An example of
a recoil-sensitive observable is angularities for the angular exponent $a > 1$ ($\beta < 1$), which
was studied in Ref. [54]. Because $C_N$ is insensitive to recoils, it is better able to resolve the
collinear singularity of QCD.

⁸ In particular, $\beta$ serves two different roles for $N$-subjettiness. As in $C_N$, $\beta$ controls the weight given
to collinear or wide-angle emissions. In addition, when the minimization procedure is used, $\beta$ controls the
location of the axes which minimize $N$-subjettiness. When trying to determine the optimal value for $\beta$ for
subjet discrimination, it is difficult to disentangle these two effects.

For 1-prong jets, the effect of recoil on an observable is illustrated in Fig. 1a. Because
of conservation of momentum, soft wide-angle radiation displaces the hard jet core from the
jet axis. Angularities (i.e. 1-subjettiness) are sensitive to this displacement since they are
measured with respect to the jet center. For a jet with two constituents separated by an
angle $\theta_{12}$ (using the notation in Eq. (2.1) for simplicity), with the jet axis taken along the
total jet momentum,

\[
  \tau_1^{(\beta)} = E_2 \left( \frac{E_1\, \theta_{12}}{E_1 + E_2} \right)^{\beta}
  + E_1 \left( \frac{E_2\, \theta_{12}}{E_1 + E_2} \right)^{\beta} . \tag{2.18}
\]

Taking the $E_2 \ll E_1$ limit, one can view the first term as the contribution directly from the
emission $E_2$, while the second term comes about because particle 1 recoils when it emits
particle 2. The dependence of $\tau_1^{(\beta)}$ on the energies and emission angle is different according
to the value of $\beta$. For $\beta > 1$, the second term is negligible, and the angularities become

\[
  \tau_1^{(\beta > 1)} \simeq E_2\, (\theta_{12})^{\beta} , \tag{2.19}
\]

such that $\tau_1$ is linear in the soft radiation $E_2$. However, for smaller values of $\beta$, the expression
for angularities changes because recoil effects become important. For $\beta = 1$, both terms are
identical in the $E_2 \ll E_1$ limit and angularities become

\[
  \tau_1^{(1)} \simeq 2 E_2\, \theta_{12} . \tag{2.20}
\]

For $\beta < 1$, the first term is negligible in the $E_2 \ll E_1$ limit, and the angularities are dominated
by the effect of recoil of the hard radiation

\[
  \tau_1^{(\beta < 1)} \simeq E_1^{1-\beta} E_2^{\beta}\, (\theta_{12})^{\beta} . \tag{2.21}
\]

By contrast, the observable $C_1^{(\beta)}$ has the same behavior for all values of $\beta$:

\[
  \mathrm{ECF}(1,\beta) = E_1 + E_2 , \qquad
  \mathrm{ECF}(2,\beta) = E_1 E_2\, (\theta_{12})^{\beta}
  \quad \Longrightarrow \quad
  C_1^{(\beta)} \simeq \frac{E_2}{E_1}\, (\theta_{12})^{\beta} , \tag{2.22}
\]

which is dominated by the splitting angle and energy of the softer particle in the jet for all
values of $\beta > 0$.
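
The different soft scaling of angularities and $C_1$ can be made concrete with a small numerical experiment. The sketch below is an illustration under stated assumptions: it takes the jet axis of the two-particle system to sit at the energy-weighted centroid (small-angle approximation), and the function names, the fixed values $E_1 = 1$ and $\theta_{12} = 0.4$, and the finite-difference estimate of the exponent are choices made for this example.

```python
from math import log


def tau1(e1, e2, theta, beta):
    """Two-particle angularity about the energy-weighted jet axis (small angles)."""
    total = e1 + e2
    angle_of_2 = e1 * theta / total  # angle between particle 2 and the jet axis
    angle_of_1 = e2 * theta / total  # recoil of particle 1 away from the jet axis
    return e2 * angle_of_2 ** beta + e1 * angle_of_1 ** beta


def c1(e1, e2, theta, beta):
    """C_1 = ECF(2) ECF(0) / ECF(1)^2 for the same two particles."""
    return e1 * e2 * theta ** beta / (e1 + e2) ** 2


def soft_exponent(observable, beta, e_hi=1e-4, e_lo=1e-5):
    """Log-slope d ln(observable) / d ln(E_2) at fixed E_1 = 1 and theta = 0.4."""
    ratio = observable(1.0, e_hi, 0.4, beta) / observable(1.0, e_lo, 0.4, beta)
    return log(ratio) / log(e_hi / e_lo)


slopes = {
    "tau1, beta=2.0": soft_exponent(tau1, 2.0),  # close to 1: linear in E_2
    "tau1, beta=0.5": soft_exponent(tau1, 0.5),  # close to 0.5: recoil dominated
    "C1,   beta=2.0": soft_exponent(c1, 2.0),    # close to 1
    "C1,   beta=0.5": soft_exponent(c1, 0.5),    # close to 1: no recoil term
}
```

The angularity switches from scaling like $E_2$ to scaling like $E_2^{\beta}$ once $\beta < 1$, whereas $C_1$ stays linear in $E_2$ for both exponents.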

Because $N$-subjettiness is essentially a sum over $N$ subjet angularities, $\tau_N$ can suffer from
the same recoil-sensitivity as angularities for $\beta < 1$, depending on how the subjet axes are
determined. For example, if $N$-subjettiness is defined using $k_T$ subjet axes, then $\tau_N$ is recoil
sensitive. $N$-subjettiness is also recoil sensitive if the subjet axes are aligned with the subjet
momenta. A related issue is that if the subjet axes are determined using the minimization
procedure, then the minimization is only well-behaved for $\beta \geq 1$.⁹ In all of these cases, the
useful range of $\beta$ is limited to $\beta \geq 1$. In contrast, the energy correlation double ratio is
recoil-free and well-behaved for the whole IRC-safe range $\beta > 0$. As we will see in Sec. 3 (and
demonstrated recently in Ref. [73]), this is relevant for quark/gluon discrimination, where
$\beta \simeq 0.2$ for $C_1$ is the optimal choice.

It should be noted that one can construct a recoil-free version of $N$-subjettiness where
the subjet axes are always chosen to minimize the $\beta = 1$ measure (see forthcoming work in
Ref. [60]), regardless of which $\beta$ is used in Eq. (2.16). We refer to axes defined in this way as
"broadening axes", since $\beta = 1$ is closely related to the jet broadening measure [74]. We will
make use of this fact later when comparing $C_1$ to 1-subjettiness in Sec. 3.3.

⁹ That said, the minimization procedure does eliminate the recoil effect.
2.2.2 Sensitivity to Soft Wide-Angle Emissions

Another point of contrast between $C_N$ and $\tau_{N,N-1}$ is in how the two variables behave in the
presence of emissions at multiple angular scales. The way $N$-subjettiness is defined, every jet
is partitioned into $N$ subjets, even if there are fewer than $N$ "real" subjets. For example,
when a jet has a soft subjet separated at large angle (as one might expect from the radiation
off a quark or gluon), $N$-subjettiness will still identify that soft subjet region, yielding a
relatively low value of $\tau_{N,N-1}$ (and therefore making the jet look more $N$-prong-like than
it really is). In contrast, because the energy correlation function is sensitive to all possible
soft and collinear singularities, $C_N$ takes on a relatively high value in the presence of a soft
wide-angle subjet, making the jet look less $N$-prong-like (as desired).

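We can show this concretely for $C_2$ using the configuration in Fig. 1b, where there is the
following hierarchy of the energies and angles:¹⁰

\[
  E_1 \gg E_2, E_3 , \qquad \theta_{13} \ll \theta_{12} \simeq \theta_{23} . \tag{2.23}
\]

Again using the notation in Eq. (2.1), the energy correlation functions are

\begin{align}
  \mathrm{ECF}(1,\beta) &\simeq E_1 , \qquad
  \mathrm{ECF}(2,\beta) \simeq E_1 \max\left[ E_2\, (\theta_{12})^{\beta},\; E_3\, (\theta_{13})^{\beta} \right] , \nonumber \\
  \mathrm{ECF}(3,\beta) &= E_1 E_2 E_3\, (\theta_{12}\, \theta_{23}\, \theta_{13})^{\beta} , \tag{2.24}
\end{align}

yielding

\[
  C_2^{(\beta)} = \frac{\mathrm{ECF}(3,\beta)\, \mathrm{ECF}(1,\beta)}{\mathrm{ECF}(2,\beta)^2}
  \simeq \frac{E_2 E_3\, (\theta_{12})^{2\beta}\, (\theta_{13})^{\beta}}
  {\left( \max\left[ E_2\, (\theta_{12})^{\beta},\; E_3\, (\theta_{13})^{\beta} \right] \right)^2} . \tag{2.25}
\]

For $N$-subjettiness with three jet constituents, it is consistent to choose axes that lie along
the hardest particle in a subjet. For 1-subjettiness, the axis lies along particle 1. For
2-subjettiness, one axis lies along particle 1 and the other axis lies along particle 2 or particle
3, depending on the relationship between $E_3\, (\theta_{13})^{\beta}$ and $E_2\, (\theta_{12})^{\beta}$. This gives

\begin{align}
  \tau_1^{(\beta)} &\simeq \max\left[ E_2\, (\theta_{12})^{\beta},\; E_3\, (\theta_{13})^{\beta} \right] , \qquad
  \tau_2^{(\beta)} \simeq \min\left[ E_2\, (\theta_{12})^{\beta},\; E_3\, (\theta_{13})^{\beta} \right] , \nonumber \\
  \tau_{2,1}^{(\beta)} &\simeq \frac{\min\left[ E_2\, (\theta_{12})^{\beta},\; E_3\, (\theta_{13})^{\beta} \right]}
  {\max\left[ E_2\, (\theta_{12})^{\beta},\; E_3\, (\theta_{13})^{\beta} \right]} . \tag{2.26}
\end{align}

Regardless of the ordering of $E_3\, (\theta_{13})^{\beta}$ and $E_2\, (\theta_{12})^{\beta}$, we see that

\[
  C_2^{(\beta)} \simeq \tau_{2,1}^{(\beta)} \times (\theta_{12})^{\beta} , \tag{2.27}
\]

so in the presence of a soft subjet at large angle $\theta_{12}$, $C_2$ yields a larger value than $\tau_{2,1}$ (i.e. more
background-like as desired). As we will see in Sec. 4, this allows $C_2$ to perform better than
$\tau_{2,1}$ for background rejection in regions of phase space where soft wide-angle radiation plays
an important role.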
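
The relation between $C_2$ and $\tau_{2,1}$ in this three-particle limit can be tested on an explicit configuration. The following sketch makes several assumptions for illustration: particles are `(E, x, y)` points in a flat small-angle plane, the $N$-subjettiness axes are placed along particle directions as described in the text (no minimization over continuous axis positions), and the numerical values of the energies and angles are invented for the example.

```python
from itertools import combinations
from math import hypot, prod


def dist(a, b):
    """Small-angle distance between two particles given as (E, x, y)."""
    return hypot(a[1] - b[1], a[2] - b[2])


def ecf(particles, n, beta):
    """Energy correlation function with pairwise angles from dist()."""
    return sum(
        prod(p[0] for p in s) * prod(dist(a, b) for a, b in combinations(s, 2)) ** beta
        for s in combinations(particles, n)
    )


def tau(particles, axes, beta):
    """N-subjettiness for a fixed list of axes (here: directions of chosen particles)."""
    return sum(p[0] * min(dist(p, axis) for axis in axes) ** beta for p in particles)


# Hard core, soft wide-angle emission, and soft collinear emission.
p1, p2, p3 = (1.0, 0.0, 0.0), (0.05, 0.5, 0.0), (0.02, 0.0, 0.02)
jet = [p1, p2, p3]
beta = 1.0

c2 = ecf(jet, 3, beta) * ecf(jet, 1, beta) / ecf(jet, 2, beta) ** 2
tau1 = tau(jet, [p1], beta)
tau2 = min(tau(jet, [p1, p2], beta), tau(jet, [p1, p3], beta))
tau21 = tau2 / tau1

# Ratio of the two sides of the approximate relation; expected to be close to 1.
ratio = c2 / (tau21 * dist(p1, p2) ** beta)
```

For these inputs the ratio differs from one by about one percent, with the residual coming from the subleading terms dropped in the strongly ordered limit.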



¹⁰ Roughly the same conclusions about $C_2$ versus $\tau_{2,1}$ hold for the limit $E_1 \simeq E_2 \gg E_3$ as well, which is
relevant for the $Z$ boson discussion below.
One way to understand the improved performance of $C_2$ with respect to $\tau_{2,1}$ is to consider
the concrete example of $\beta = 2$ at fixed jet mass $m$.¹¹ Using the kinematic limit above, the
jet mass-squared is given approximately by

\[
  m^2 \simeq E_1 \max\left[ E_2\, (\theta_{12})^2,\; E_3\, (\theta_{13})^2 \right] , \tag{2.28}
\]

and it is convenient to define $z$ as the energy fraction of the emission that dominates the mass
(e.g. $z = E_2 / E_1$ if $E_2\, (\theta_{12})^2 > E_3\, (\theta_{13})^2$). For fixed jet mass, QCD backgrounds tend to peak
at small values of $z$, but we see from Eq. (2.26) that $\tau_{2,1}$ does not have any $z$-dependence
for fixed jet mass. For $C_2$, if particle 2 dominates the mass (i.e. if a soft wide-angle emission
dominates the mass), then

\[
  C_2^{(2)} \simeq \frac{E_3\, (\theta_{13})^2}{z\, E_1} \propto \frac{1}{z} , \tag{2.29}
\]

so $C_2$ penalizes small values of $z$. In this way, $C_2$ acts similarly to taggers that reject jets if
the kinematics of the dominant splitting of the jet is consistent with background [3-6, 11, 12].
In contrast, $\tau_{2,1}$ only exploits the degree to which radiation is collimated with respect to the
two subjet directions, and does not take into account the $z$-dependence at fixed jet mass.

If particle 3 dominates the mass (i.e. if the mass is dominated by a hard core of energy),
then $C_2$ is constant in the energy fraction $z$, and so is no longer affected by the kinematics
of the emission that generated the mass. However, there is still the potential for improved
performance in identifying boosted color singlet resonances like $Z$ bosons. For a boosted
$Z$ boson, emissions at wide angle with respect to the angle between decay products are
suppressed by color coherence. As one goes to higher boosts where the ratio of jet mass to jet
$p_T$ decreases for fixed jet radius, the volume of phase space for allowed emissions decreases,
which can also be seen as a consequence of angular ordering. It is therefore less likely for
a $Z$ boson signal to generate final state radiation at large $\theta_{12}$, while background QCD jets
will emit at large angle independently of the $p_T$. Because radiation at large angles has an
enhanced effect on $C_2$ as compared to $\tau_{2,1}$, cf. Eq. (2.27), we expect $C_2$ to be more effective
at discriminating color-singlet signals from background QCD jets.



3 Quark vs. Gluon Discrimination with $C_1$

Our first case study is to use the energy correlation functions to discriminate between quark
jets and gluon jets. The observable $C_1^{(\beta)}$ contains the 2-point energy correlation function
$\mathrm{ECF}(2,\beta)$ and so is sensitive to radiation in a jet about a single hard core.¹² This case
study is simple enough that we can predict the quark/gluon discrimination power through an
analytic calculation, which we will subsequently validate with Monte Carlo simulations. In
our later case studies involving higher-point correlators, we will rely on Monte Carlo alone.

¹¹ We thank Gregory Soyez for helpful discussions on these points.

¹² The CMS experiment uses an observable they call $p_T D \equiv \sum_i p_{Ti}^2 / \left( \sum_i p_{Ti} \right)^2$ for quark versus gluon
discrimination [39, 75]. It is related to the $\beta = 0$ limit of $C_1^{(\beta)}$ as $p_T D = 1 - 2 C_1^{(0)}$.
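
The relation between $p_T D$ and the $\beta = 0$ limit of $C_1$ quoted in the footnote follows from expanding $\left(\sum_i p_{Ti}\right)^2$ into its diagonal and off-diagonal parts. The short sketch below checks it numerically; the function names and the list of transverse momenta are assumptions made for this illustration, and $p_T D$ is taken exactly as written in the footnote.

```python
def pt_d(pts):
    """p_T D as quoted in the footnote: sum of pT^2 over the square of the summed pT."""
    return sum(pt * pt for pt in pts) / sum(pts) ** 2


def c1_beta_zero(pts):
    """C_1 at beta = 0: all angular factors equal one, leaving a sum over pT pairs."""
    n = len(pts)
    pair_sum = sum(pts[i] * pts[j] for i in range(n) for j in range(i + 1, n))
    return pair_sum / sum(pts) ** 2


# Hypothetical constituent transverse momenta of one jet.
pts = [60.0, 25.0, 10.0, 5.0]
lhs = pt_d(pts)
rhs = 1.0 - 2.0 * c1_beta_zero(pts)
```

Both sides evaluate to 0.435 for this example, since the angular information drops out entirely at $\beta = 0$.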
In any discussion of quark-gluon discrimination, one should start with a reminder that
defining what is meant by a quark or a gluon jet is a subtle task, since the one existing 
infrared-safe way of defining quark and gluon jets [76] works only at parton level. Existing 
work on practical aspects of quark-gluon discrimination in Refs. [39, 73, 75, 77, 78] has 
not entered into these issues. Instead the discussion has relied on Monte Carlo simulations, 
defining a quark (gluon) jet to be whatever results from the showering of a quark (gluon) 
parton. We will adopt a variant of this methodology in our Monte Carlo studies. Our analytic 
approach will instead define a quark or gluon jet in terms of the sum of the flavors of the 
partons contained inside it. It is based on resummation and therefore contains similar physics 
to the Monte Carlo parton shower. 

3.1 Leading Logarithmic Analysis 

We begin our analysis by considering the leading logarithmic (LL) structure of the cross
section for the observable $C_1^{(\beta)}$. With $L$ equal to the logarithm of $C_1^{(\beta)}$, we define LL order as
including all terms in the cross section that scale like $\alpha_s^n L^{2n}$, for $n \geq 1$. At LL order, quark
versus gluon jet discrimination can be understood as a consequence of quarks and gluons
having different color charges. To LL order, the strong coupling constant $\alpha_s$ can be taken
fixed and only the most singular term in the splitting function need be retained. With only
one soft-collinear gluon emission, the normalized differential cross section for any infrared and
collinear safe observable $e$ has the same form for both quark and gluon jets:

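\[
  \frac{1}{\sigma} \frac{d\sigma}{de} = \frac{\alpha_s}{\pi}\, C
  \int_0^{R_0^2} \frac{d\theta^2}{\theta^2} \int_0^1 \frac{dz}{z}\; \delta(e - \hat{e}) , \tag{3.1}
\]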

where $C$ is the color factor, $R_0$ is the jet radius,¹³ $z$ is the energy fraction of the emitted
gluon, $\theta$ is its splitting angle, and $\hat{e}$ is a function of $z$ and $\theta$. Recall that $C_F = 4/3$ for quarks
and $C_A = 3$ for gluons.

At this order, the observable is 

Cf } = z(l - z)eP, (3.2) 

which takes a maximum value of \Rq- So integrating Eq. (3.1) yields, for small C^f\ the 
cross section 

1 da ^sCJ_ B%_ 
a dC W tt C W m C W ■ 

We identify the logarithm L as 

L = lnJ^ (3.4) 

which we use in the following expressions for compactness. This distribution can be resummed 
to LL order by exponentiating the cumulative distribution. The resummed distribution 
that follows is then 

$$\frac{1}{\sigma}\frac{d\sigma^{\rm LL}}{dC_1^{(\beta)}} = \frac{2\alpha_s}{\pi}\frac{C}{\beta}\frac{L}{C_1^{(\beta)}}\,e^{-\frac{\alpha_s}{\pi}\frac{C}{\beta}L^2}. \tag{3.5}$$

13 We use this somewhat non-standard notation because R will later be used with a different meaning. 



Because the quark color factor is smaller than the gluon color factor, the Sudakov suppression 
is less for quarks. Thus, the $C_1^{(\beta)}$ distribution for quark jets is peaked at smaller values than 
for gluon jets. 
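
As a quick numerical illustration of this ordering, the sketch below locates the peak of the LL distribution in $\ln C_1^{(\beta)}$ for each color factor. It assumes the fixed-coupling LL form $d\sigma/dL \propto 2aL\,e^{-aL^2}$ with $a = \alpha_s C/(\pi\beta)$; the values of `alpha_s` and `r0` and the function name are illustrative choices, not taken from the analysis.

```python
import math

C_F, C_A = 4.0 / 3.0, 3.0  # quark and gluon color factors


def ll_log_peak(color, beta, alpha_s=0.12, r0=0.6):
    """C_1 value at which the LL distribution in ln(C_1) is largest.

    With a = alpha_s * C / (pi * beta), the LL distribution in L is
    proportional to 2 * a * L * exp(-a * L**2), which peaks at
    L = 1 / sqrt(2 * a).  alpha_s and r0 are illustrative assumptions.
    """
    a = alpha_s * color / (math.pi * beta)
    l_peak = 1.0 / math.sqrt(2.0 * a)
    # Invert L = ln(R0^beta / C_1) to get the observable value.
    return r0 ** beta * math.exp(-l_peak)


quark_peak = ll_log_peak(C_F, beta=1.0)
gluon_peak = ll_log_peak(C_A, beta=1.0)
print(f"quark peak: {quark_peak:.4f}, gluon peak: {gluon_peak:.4f}")
```

The larger gluon color factor gives a smaller peak value of $L$, hence a larger peak value of the observable.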

To determine the quark/gluon discrimination power from this resummed distribution, 
we will make a sliding cut on $C_1^{(\beta)}$ and count the number of events that lie to the left of the cut. 
Adjusting this cut then defines a ROC curve relating the signal (quark) jet efficiency to the 
background (gluon) jet rejection. To LL accuracy, the (normalized) cumulative distributions 
for quarks and gluons are: 

$$\Sigma_q(C_1^{(\beta)}) = e^{-\frac{\alpha_s}{\pi}\frac{C_F}{\beta}L^2}, \qquad \Sigma_g(C_1^{(\beta)}) = e^{-\frac{\alpha_s}{\pi}\frac{C_A}{\beta}L^2}. \tag{3.6}$$

Note that at LL order, there is a simple relationship between these cumulative distributions: 

$$\Sigma_g(C_1^{(\beta)}) = \left(\Sigma_q(C_1^{(\beta)})\right)^{C_A/C_F}. \tag{3.7}$$

Thus, if a sliding cut on $C_1^{(\beta)}$ retains a fraction $x$ of the quarks, it will retain a fraction $x^{C_A/C_F}$ 
of the gluons. The quark/gluon discrimination curve is then 

$$\text{disc}(x) = x^{C_A/C_F} = x^{9/4}, \tag{3.8}$$

which (perhaps surprisingly) is independent of $\beta$. Only beyond LL order does the 
discrimination curve depend on $\beta$. 
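
The LL discrimination curve is simple enough to evaluate directly. The sketch below computes the gluon efficiency $x^{C_A/C_F}$ and the corresponding gluon rejection for a given quark efficiency $x$; the function names are ours and purely illustrative.

```python
C_F, C_A = 4.0 / 3.0, 3.0  # quark and gluon color factors


def ll_gluon_efficiency(quark_eff):
    """Fraction of gluon jets kept by a cut that keeps quark_eff of the quarks (LL)."""
    return quark_eff ** (C_A / C_F)


def ll_gluon_rejection(quark_eff):
    """One minus the gluon efficiency at LL order."""
    return 1.0 - ll_gluon_efficiency(quark_eff)


for x in (0.3, 0.5, 0.7):
    print(f"quark eff {x:.1f} -> gluon rejection {ll_gluon_rejection(x):.3f}")
```

At 50% quark efficiency this gives a gluon rejection of about 0.79, for any $\beta$.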

3.2 Next-to-Leading Logarithmic Analysis 

We continue our analysis to next-to-leading logarithmic (NLL) order, which we define as 
including all terms that scale as $\alpha_s^n L^{n+1}$ and $\alpha_s^n L^n$ in $\ln\Sigma$. In addition, we will also include the 
non-logarithmically enhanced term arising at $O(\alpha_s)$. At NLL order, there are several new 
effects that must be included, which together turn out to improve the quark/gluon 
discrimination power of $C_1^{(\beta)}$ compared to the LL estimate. The dominant effects are subleading 
terms in the splitting functions and phase space restrictions due to multiple emissions. In 
addition, one must account for the running of $\alpha_s$, fixed-order corrections, and non-global 
logarithms [79] arising from the phase space cut of the jet algorithm. We will consider how these 
affect the discrimination power of $C_1^{(\beta)}$, ultimately showing that small values of $\beta$ improve 
quark/gluon discrimination. We will work in an approximation of small jet radius, $R_0 \ll 1$, 
which will allow us to consider only the effects of radiation from the jet, while neglecting 
modifications associated with the full antenna structure of initial and final-state partons. 

The resummation to NLL for generic (global) observables was carried out in Ref. [54]. 
The central result of that analysis was an expression for the NLL cumulative distribution 
for an arbitrary observable (satisfying certain basic conditions, e.g. recursive infrared safety). 
From Ref. [54], the probability that the value of an observable is less than $e^{-L}$ takes a form 
such as 

$$\Sigma = N\,\frac{e^{-\gamma_E R'}}{\Gamma(1+R')}\,e^{-R}, \qquad R' \equiv \frac{\partial R}{\partial L}, \tag{3.9}$$

where $N$ is a matching factor to fixed order, $N = 1 + O(\alpha_s)$, and $\gamma_E \simeq 0.5772$ is the 
Euler-Mascheroni constant. In a fixed-coupling approximation, the "radiator" function $R$ for the 
observable $C_1^{(\beta)}$ is 

$$R = \frac{\alpha_s}{\pi}\frac{C}{\beta}(L+B)^2, \tag{3.10}$$

where $C$ is the color factor of the jet and $B$ encodes subleading terms in the splitting 
functions.$^{14}$ For quark jets $B_q = -\frac{3}{4}$ and for gluon jets $B_g = -\frac{11}{12} + \frac{n_f}{6C_A}$, where $n_f$ is the number 
of light quark flavors. The specific NLL resummed formula in Eq. (3.9) holds for observables 
that are global, recursively infrared and collinear safe (rIRC), and additive. The last two 
conditions are satisfied by $C_1^{(\beta)}$. The general expression for $R$ with running $\alpha_s$ appears in 
Ref. [54]. The scale at which $\alpha_s$ is evaluated is $p_T R_0$, and we will use the shorthand 

$$\alpha_s \equiv \alpha_s(p_T R_0), \tag{3.11}$$

unless an explicit scale is used as the argument of $\alpha_s$. Because $C_1^{(\beta)}$ for a jet is non-global, it 
is necessary to include an extra factor in the resummation, discussed in detail in Sec. 3.2.3. 
We will also include information obtained by matching to the $O(\alpha_s)$ fixed-order cross section, 
where the matching procedure is described in App. A. 

Armed with the matched NLL cumulative distribution, including the non-global and 
$O(\alpha_s)$ corrections, we can now determine the quark versus gluon discrimination curve by 
numerically inverting $\Sigma_q$ and plugging it into the expression for $\Sigma_g$. This is shown for various 
values of $\beta$ in Fig. 2a. In Fig. 2b, we fix 50% quark efficiency and show the gluon rejection 
rate (i.e. one minus the gluon efficiency) as a function of $\beta$ for $R_0 = 0.6$. Also on this plot 
is an approximate analytic expression for the rejection rate as a function of $\beta$ that we derive 
below in Eq. (3.22). We see that the discrimination power improves as $\beta$ decreases. It is, 
however, not sensible to take $\beta$ too small: for $\beta = 0$ our observable is collinear unsafe, and 
large non-perturbative effects can be expected as $\beta$ approaches zero. Furthermore for $\beta \lesssim \alpha_s$ 
the convergence of our calculation breaks down (cf. App. B). 
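
The numerical inversion can be sketched in a few lines. The example below is a deliberately simplified stand-in for the full calculation: it assumes $\ln\Sigma = -\gamma_E R' - \ln\Gamma(1+R') - R$ with the fixed-coupling radiator $R = \frac{\alpha_s}{\pi}\frac{C}{\beta}(L+B)^2$, sets $N = 1$, and omits running-coupling, non-global, and fixed-order matching effects. The value `alpha_s = 0.1`, the bisection bounds, and all function names are illustrative assumptions, so the output is not expected to reproduce Fig. 2.

```python
import math

EULER_GAMMA = 0.5772156649015329
C_F, C_A, N_F = 4.0 / 3.0, 3.0, 5
B_Q = -3.0 / 4.0
B_G = -11.0 / 12.0 + N_F / (6.0 * C_A)


def log_sigma(L, color, b, beta, alpha_s):
    """ln(Sigma) with N = 1 and the fixed-coupling radiator (simplified sketch)."""
    a = alpha_s * color / (math.pi * beta)
    radiator = a * (L + b) ** 2
    r_prime = 2.0 * a * (L + b)  # dR/dL
    return -EULER_GAMMA * r_prime - math.lgamma(1.0 + r_prime) - radiator


def gluon_rejection(quark_eff, beta, alpha_s=0.1):
    """Invert Sigma_q by bisection in L, then evaluate Sigma_g at the same L."""
    target = math.log(quark_eff)
    lo, hi = -B_Q, 50.0  # ln(Sigma_q) = 0 at L = -B_q and decreases with L
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if log_sigma(mid, C_F, B_Q, beta, alpha_s) > target:
            lo = mid
        else:
            hi = mid
    cut = 0.5 * (lo + hi)
    return 1.0 - math.exp(log_sigma(cut, C_A, B_G, beta, alpha_s))


for beta in (0.5, 1.0, 2.0):
    print(f"beta = {beta}: rejection at 50% quark eff = {gluon_rejection(0.5, beta):.3f}")
```

Even in this reduced form, the rejection at fixed quark efficiency grows as $\beta$ decreases and lies above the LL value $1 - 0.5^{9/4}$.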

Figure 2: Left: Quark/gluon discrimination curves using $C_1^{(\beta)}$, calculated at NLL order 
matched to fixed order for various values of $\beta$. The $\beta$-independent LL prediction is shown 
for comparison. Right: Gluon rejection rates at 50% quark efficiency, as a function of $\beta$, 
demonstrating that $\beta \simeq 0.2$ is optimal at NLL order (for smaller values of $\beta$, non-perturbative 
effects become important). Also shown is an analytic approximation from Eq. (3.22) ($C_1$ 
Approx.) that includes the most important physics that enters at NLL. 

14 To obtain Eq. (3.10), we used the fact that, for a general jet observable that takes the form 

$$\left(\frac{p_{Ti}}{p_{TJ}}\right)^{A}\left(\frac{R_i}{R_0}\right)^{B},$$

where $R_i$ is the angle of the emission, Eq. (2.19) in Ref. [54] applies for $a = A$, $b = B - A$, and $d = 1$, where we 
identify the scales $Q = Q_{12} = 2E_\ell = p_{TJ}R_0$. The sum over $\ell = 1, 2$ in Eq. (2.19) is replaced by the individual 
contribution $\ell = 1$, 

To understand the behavior of Fig. 2b semi-analytically, we will study the impact of 
different physical effects on the discrimination. To do so, we will again express $\Sigma_g$ in terms of 
$\Sigma_q$ so as to determine the discrimination power of a cut on $C_1^{(\beta)}$. In fact, we are most interested 
in the exponent relating $\Sigma_g$ to $\Sigma_q$ (as in Eq. (3.7)), so we will actually relate the logarithms 
of the two cumulative distributions to one another. We are interested in the regime where 
$\ln 1/\Sigma \sim 1$, which, from Eq. (3.6), implies that $\alpha_s L^2 \sim 1$. The logarithm of the cumulative 
distribution has the schematic expansion 

$$\ln\Sigma \sim \alpha_s L^2 + \alpha_s L + \alpha_s + \alpha_s^2 L^3 + \alpha_s^2 L^2 + \alpha_s^2 L + \alpha_s^2 + O(\alpha_s^3). \tag{3.12}$$

With the power counting of $\alpha_s L^2 \sim 1$, we will consider all terms from Eq. (3.12) that scale 
as $\alpha_s^0$, $\alpha_s^{1/2}$, or $\alpha_s^1$. This corresponds to all terms at order $\alpha_s$ from Eq. (3.12), as well as the 
terms at $\alpha_s^2 L^3$, $\alpha_s^2 L^2$, and $\alpha_s^3 L^4$. To illustrate this power counting, consider, for example, the 
term $\alpha_s L$, which scales as $\alpha_s^{1/2}$ as one varies $\alpha_s$ while keeping $\alpha_s L^2$ fixed and of order 1. 

In what follows we will pay special attention to the terms at order $\alpha_s L$ and $\alpha_s^2 L^2$, which 
turn out to be the most relevant ones when establishing deviations from our LL analysis and 
whose dominant contributions have clearly identifiable physical origins. The terms at order 
$\alpha_s^2 L^3$ and $\alpha_s^3 L^4$ are simply proportional to the LL color factor, multiplied by powers of the 
$\beta$-function, and so do not significantly modify the LL analysis. 
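
The power counting can be checked mechanically. The sketch below assigns to each term $\alpha_s^n L^k$ the effective power $n - k/2$ of $\alpha_s$ (from $L \sim \alpha_s^{-1/2}$) and lists the terms with effective power at most 1. It assumes, as in the NLL definition above, that $\ln\Sigma$ contains at most $L^{n+1}$ at order $\alpha_s^n$; the function names are illustrative.

```python
from fractions import Fraction


def alpha_s_scaling(n, k):
    """Effective power of alpha_s for alpha_s**n * L**k when alpha_s * L**2 ~ 1."""
    return Fraction(n) - Fraction(k, 2)


def retained_terms(max_order=3):
    """(n, k) pairs whose term alpha_s**n * L**k scales as alpha_s**1 or lower."""
    kept = []
    for n in range(1, max_order + 1):
        for k in range(0, n + 2):  # at most L**(n+1) at order alpha_s**n
            if alpha_s_scaling(n, k) <= 1:
                kept.append((n, k))
    return kept


for n, k in retained_terms():
    print(f"alpha_s^{n} L^{k} scales as alpha_s^({alpha_s_scaling(n, k)})")
```

The retained set is the three $O(\alpha_s)$ terms together with $\alpha_s^2 L^3$, $\alpha_s^2 L^2$, and $\alpha_s^3 L^4$, matching the list in the text.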

3.2.1 Subleading Terms in Splitting Functions 

We first consider the effect on the discrimination from the subleading terms in the splitting 
functions. In the observable $C_1^{(\beta)}$, $\beta$ controls the weight given to collinear and wide-angle 
emissions in the jet. At large values of $\beta$, wide-angle emissions are given greater weight, and 
at small values of $\beta$, collinear emissions are given greater weight. Wide-angle soft radiation 
is controlled by the term in the splitting function that diverges as the energy fraction goes 
to zero; i.e., the term $dz/z$. Both quarks and gluons have the same functional form for the 
soft limit of the splitting function, with the only difference being the overall color factor. By 
contrast, collinear emissions are controlled by the subleading terms in the splitting function, 
which differ for quarks and gluons (i.e. different values of the $B$ coefficient). Therefore, as 
$\beta$ goes to zero and the collinear emissions become more important in $C_1^{(\beta)}$, the differences 
between the quark and gluon splitting functions are accentuated. 

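To see this behavior directly from Eq. (3.9), we can ignore the $R'$-dependent prefactor 
and focus on the $e^{-R}$ factor. We can write $B_g = B_q + \delta B$, where 

$$\delta B = \frac{n_f - C_A}{6C_A}, \tag{3.13}$$

which is $\frac{1}{9}$ for $n_f = 5$. We then have 

$$R_g = \frac{\alpha_s}{\pi}\frac{C_A}{\beta}(L + B_q + \delta B)^2 = \frac{C_A}{C_F}R_q\left(1 + \frac{\delta B}{L + B_q}\right)^2. \tag{3.14}$$

This last form allows us to relate the cumulative distribution for gluons to that of quarks, in 
the same spirit as Eq. (3.7). Expanding to first order in $\delta B$ and using the LL relation 
$\ln 1/\Sigma_q \simeq \frac{\alpha_s}{\pi}\frac{C_F}{\beta}L^2$ to eliminate $L$ gives 

$$\ln\Sigma_g \simeq \frac{C_A}{C_F}\left(1 + 2\,\delta B\sqrt{\frac{\alpha_s C_F}{\pi\beta\,\ln 1/\Sigma_q}}\right)\ln\Sigma_q. \tag{3.15}$$

This implies that the separation between the quark and gluon distributions increases as $\beta$ 
decreases and so smaller values of $\beta$ result in better discrimination. Because this effect first 
arises at $O(\sqrt{\alpha_s})$, there will be corrections at $O(\alpha_s)$ due to the running coupling. Note also 
that the coefficient $\delta B$ is quite small in QCD, and so the total effect from the subleading 
terms in the splitting functions on the discrimination power is minimal. 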
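
The smallness of the difference between the gluon and quark $B$ coefficients is easy to verify with exact arithmetic. The sketch below uses the standard values $B_q = -3/4$ and $B_g = -11/12 + n_f/(6C_A)$; the helper names are illustrative.

```python
from fractions import Fraction

C_A = Fraction(3)


def b_quark():
    """Subleading splitting-function coefficient for quark jets."""
    return Fraction(-3, 4)


def b_gluon(n_f):
    """Subleading splitting-function coefficient for gluon jets with n_f light flavors."""
    return Fraction(-11, 12) + Fraction(n_f) / (6 * C_A)


def delta_b(n_f):
    """Difference B_g - B_q, which equals (n_f - C_A) / (6 C_A)."""
    return b_gluon(n_f) - b_quark()


print(delta_b(5))  # 1/9
```

The difference vanishes for $n_f = C_A = 3$ and equals $1/9$ for $n_f = 5$, which is why this effect is numerically modest.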

3.2.2 Multiple Emissions 

Next, consider the effect of multiple emissions. The Sudakov logarithm corresponds to the 
integral of the area (in $\ln k_t$, $\ln\theta$ space) over which emissions are forbidden. At LL, any 
number of emissions can lie arbitrarily close to the lower boundary of the phase space region 
without changing the value of the observable. At NLL, one must consider the cumulative effect 
of the emissions that lie near the phase space boundary. Multiple emissions tend to increase 
the value of the observable $C_1^{(\beta)}$, and so, for a fixed value of $C_1^{(\beta)}$, they must be suppressed. This 
introduces an extra degree of discrimination between quarks and gluons; there are likely to 
be more such emissions for gluons than quarks and so it costs more to "accept" a gluon jet. 
For a given LL Sudakov factor, the extent of the boundary region is effectively increased as 
$\beta$ is decreased, leading to better quark versus gluon discrimination at small $\beta$. 

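In Eq. (3.9), the effect of multiple emissions is seen in the $R'$-dependent prefactor. For 
small values of $R'$, the prefactor has the expansion 

$$\frac{e^{-\gamma_E R'}}{\Gamma(1+R')} = 1 - \frac{\pi^2}{12}R'^2 + O(R'^3) = 1 - \frac{\pi}{3}\frac{C}{\beta}\alpha_s R + O\!\left(\frac{\alpha_s^2}{\beta^2}LR\right). \tag{3.16}$$

We will drop terms at $O(\alpha_s^2 LR/\beta^2)$ and higher, which constrains us to consider $\beta \gtrsim \alpha_s L$. 
The cumulative distribution can then be written approximately as 

$$\ln\Sigma \simeq -R\left(1 + \frac{\pi}{3}\frac{C}{\beta}\alpha_s\right), \tag{3.17}$$

which allows us to express $\Sigma_g$ in terms of $\Sigma_q$ as 

$$\ln\Sigma_g \simeq \frac{C_A}{C_F}\,\frac{1 + \frac{\pi}{3}\frac{C_A}{\beta}\alpha_s}{1 + \frac{\pi}{3}\frac{C_F}{\beta}\alpha_s}\,\ln\Sigma_q \simeq \frac{C_A}{C_F}\left(1 + \frac{\pi}{3}\frac{C_A - C_F}{\beta}\alpha_s\right)\ln\Sigma_q. \tag{3.18}$$

This again suggests an increase in discrimination power for relatively small $\beta$. While this 
effect appears at order $\alpha_s$ rather than $\sqrt{\alpha_s}$, it has a substantially larger coefficient. 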
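
The small-$R'$ expansion of the multiple-emission prefactor can be checked numerically. The sketch below compares the exact prefactor $e^{-\gamma_E R'}/\Gamma(1+R')$ with its quadratic approximation $1 - \frac{\pi^2}{12}R'^2$; the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329


def multiple_emission_prefactor(r_prime):
    """Exact prefactor exp(-gamma_E * R') / Gamma(1 + R')."""
    return math.exp(-EULER_GAMMA * r_prime) / math.gamma(1.0 + r_prime)


def quadratic_approximation(r_prime):
    """Leading small-R' expansion: 1 - (pi^2 / 12) * R'^2."""
    return 1.0 - math.pi ** 2 / 12.0 * r_prime ** 2


for r_prime in (0.05, 0.1, 0.3, 0.6):
    exact = multiple_emission_prefactor(r_prime)
    approx = quadratic_approximation(r_prime)
    print(f"R' = {r_prime}: exact = {exact:.5f}, quadratic = {approx:.5f}")
```

The two agree closely for $R' \lesssim 0.1$ and drift apart as $R'$ grows, which is the regime where the dropped higher-order terms start to matter.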

3.2.3 Non-Global Logarithms 

Because jets are defined in a restricted phase space, non-global logarithms may contribute 
to the quark versus gluon discrimination power. The effect of non-global logarithms on the 
cumulative distribution can, for our purposes, be approximated in the large-$N_c$ limit as [79-81] 

2 2 2 

S w ith ng = e- CCA ^^ L2 Y, = e -^?2^£ . (3.19) 

This neglects some contributions starting at order $\alpha_s^3 L^3$ in the exponent, but these would 
not affect the quark-gluon discrimination at our accuracy. Recently a first numerical 
calculation has been performed including the full-$N_c$ structure [82] and it suggests that finite-$N_c$ 
corrections are small. 

If we temporarily ignore the $R'$-dependent prefactor in Eq. (3.9), the inclusion of 
non-global logarithms leads to 

lnS withNG ~-i?M -C A — . (3.20) 

All quark/gluon dependence resides in the color factor inside $R$, so we still have the property 
from the LL calculation (again, ignoring the prefactor and setting $\delta B = 0$) 

$$\Sigma_{g,\text{with NG}}(L) = \left[\Sigma_{q,\text{with NG}}(L)\right]^{C_A/C_F}. \tag{3.21}$$

Hence non-global logarithms do not modify the above arguments in any significant way. 

This analysis holds for the anti-$k_t$ jet algorithm, whose boundary is unaffected by soft 
radiation at angles $\sim R_0$. For other algorithms of the generalized-$k_t$ family, which have 
irregular, soft-emission-dependent boundaries, there are additional terms, clustering 
logarithms [83, 84], which also appear starting from order $\alpha_s^2 L^2$. Some of the $O(\alpha_s^2)$ clustering 
logarithms involve color factor combinations such as $C_F^2$ and $C_A^2$ for quarks and gluons 
respectively, and so presumably would have an impact on quark-gluon discrimination at our 
accuracy. We leave the study of these terms for future work. 

3.2.4 Summary of NLL Result 

Using the results of Ref. [54] and App. A to include all effects up through $O(\alpha_s)$ in the 
logarithm of the cumulative distribution we find 



i ^ C A L n F - C A a s C F rip — C a a s b ,„ 
lnS *" C~ F y + ^lulT^ + ^^3^ (2 " A 

i a s -K C A -C F 17 a s C F n f -C A \ , . 

+ ~3 P 36T<^/?lnl/£ g ^ 22 > 

This expression includes two terms beyond those discussed in the subsections above. The 
one proportional to bo, where &o = ^Ca — §"/ is the one- loop /3-function coefficient, has 
two origins: it comes from the running coupling corrections to the contribution from the 
subleading terms in the splitting functions and from the running-coupling corrections to the 
relation between the logarithm $L$ and $\ln\Sigma_q$. The last term in parentheses comes from $O(\alpha_s)$ 
matrix element corrections, discussed in detail in App. A. It depends on the choice of the jet 
definition, including the procedure by which one defines quark versus gluon jets at parton 
level. Specifically, we assume any algorithm is equivalent to the generalized $k_t$ family of 
jet algorithms at order $\alpha_s$, and at this order define a quark jet to be one that contains a 
quark and a gluon, while jets containing $gg$ or $q\bar{q}$ are considered to be gluon jets.$^{15}$ Beyond 
$O(\alpha_s)$, the calculation assumes that the algorithm maintains a rigid circular boundary in the 
presence of multiple soft particles at angles of order $R_0$, i.e. that it behaves like the anti-$k_t$ 
algorithm. 

15 In contrast to the situation with LO studies, at $O(\alpha_s)$ it does not make sense to discuss jet flavor based 
on the flavor of the parton that "initiates" the jet, since interference effects between diagrams mean that the 
initiating parton cannot be uniquely identified. The question of quark-gluon jet definition at fixed order is 
discussed further in App. A. 

Note that every subleading term in Eq. (3.22) is proportional to a difference of color or 
quark number factors and so the discrimination power depends sensitively on these differences. 
The overall quark versus gluon discrimination power increases as $\beta$ is decreased (even though 
the last term favors larger values of $\beta$ for $n_f > C_A$). Numerically, this behavior is dominated 
by the subleading terms in the splitting functions and the multiple emissions effect. The 
effect of the subleading terms in the splitting functions goes like $\sqrt{\alpha_s}$ and so is formally 
more important than the multiple emissions effect which is $O(\alpha_s)$. However, the effect of 
the subleading terms is multiplied by the small number $\delta B$ and so is numerically smaller 
than the contribution from multiple emissions. Running-coupling and fixed-order effects are 
significantly smaller. 

Robustly, then, smaller values of $\beta$ lead to better discrimination between quark and gluon 
jets. One explicitly sees that we have an expansion in powers of $\sqrt{\alpha_s/\beta}$, and so it can only be 
trusted for $\beta$ substantially larger than $\alpha_s$; in practice, perhaps $\beta \gtrsim (2 \sim 4) \times \alpha_s$ (see App. B). 
It is interesting to comment also on traditional angularities: for $\beta > 1$ most of Eq. (3.22) 
still holds, and only the last term in parentheses would be modified. However, for $\beta < 1$ 
angularities are dominated by recoil effects, with a structure that is independent of $\beta$, and 
so we expect that the discrimination should saturate. Because the energy correlation double 
ratio is recoil-free for all values of $\beta$, it is better able to probe the collinear singularity 
and multiple emission effects that distinguish quarks from gluons. 
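
To make the numerical hierarchy between the two dominant effects concrete, the sketch below evaluates their relative corrections to the LL exponent $C_A/C_F$. It assumes the forms $2\,\delta B\sqrt{\alpha_s C_F/(\pi\beta\ln 1/\Sigma_q)}$ for the splitting-function term and $\frac{\pi}{3}\frac{C_A - C_F}{\beta}\alpha_s$ for the multiple-emission term, which is our reading of Secs. 3.2.1 and 3.2.2. The inputs `alpha_s = 0.1` and 50% quark efficiency are illustrative, and running-coupling and fixed-order terms are left out.

```python
import math

C_F, C_A, N_F = 4.0 / 3.0, 3.0, 5


def splitting_function_term(beta, alpha_s, quark_eff):
    """Relative correction from subleading splitting-function terms (assumed form)."""
    delta_b = (N_F - C_A) / (6.0 * C_A)
    log_inv_sigma = math.log(1.0 / quark_eff)
    return 2.0 * delta_b * math.sqrt(alpha_s * C_F / (math.pi * beta * log_inv_sigma))


def multiple_emission_term(beta, alpha_s):
    """Relative correction from multiple emissions (assumed form)."""
    return math.pi / 3.0 * (C_A - C_F) / beta * alpha_s


for beta in (0.5, 1.0, 2.0):
    split = splitting_function_term(beta, 0.1, 0.5)
    multi = multiple_emission_term(beta, 0.1)
    print(f"beta = {beta}: splitting = {split:.3f}, multiple emissions = {multi:.3f}")
```

For these inputs the multiple-emission term is roughly three times the splitting-function term at $\beta = 1$, and both grow as $\beta$ decreases.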

3.3 Monte Carlo Study 

We now use a showering Monte Carlo simulation to validate the above NLL analysis of 
$C_1^{(\beta)}$. A similar study of the EEC function appears in Ref. [73], where it was called the 
two-point moment.$^{16}$ Throughout this paper, jets are identified with the anti-$k_t$ algorithm [85] 
using FastJet 3.0.3 [63]. No detector simulations are used other than to remove muons and 
neutrinos from the event samples before jet finding, as was done in analyses for the BOOST 
2010 report [1]. 

We generate pure quark and gluon dijet samples from the processes $qq \to qq$ and $gg \to gg$ 
in Pythia 8.165 [86, 87] at the 8 TeV LHC using tune 4C [88]. While Pythia is not fully 
accurate to NLL, it does include subleading terms in the splitting functions and multiple 
emissions, so not surprisingly we find improved discrimination at smaller values of $\beta$, in 
agreement with Sec. 3.2. We scan over various jet radii and $p_T$ cuts to study the dependence of 
the quark/gluon discrimination on these parameters. For this study, we only use the hardest 
reconstructed hadron-level jet in the event with a transverse momentum in the ranges of 
$p_T \in [200, 300]$ GeV, [400, 500] GeV, or [800, 900] GeV.$^{17}$ If the hardest jet in the event lies 
outside the $p_T$ range of interest, the event is ignored. In addition, we scan over jet radii values 
of $R_0 = 0.4$, 0.6, and 0.8. Because our broad conclusions hold for all samples generated, we 
only show representative plots to illustrate the quark/gluon performance of $C_1^{(\beta)}$. 

16 Ref. [73] examined the $C_1^{(\beta)}$ quark-gluon discrimination for a range of $\beta$ values and reached a conclusion that 
is consistent with ours. While their initial analysis naively suggests that $C_1^{(\beta)}$, Fig. 18, performs worse than jet 
broadening ("girth", or equivalently $\tau_1^{(\beta)}$ with $\beta = 1$), Fig. 13, that comparison involves different Monte Carlo 
event samples. Table 1 of Ref. [73] compares the observables on equal footing, which shows that $C_1^{(\beta)}$ indeed 
has better discrimination power than jet broadening, consistent with our discussion here. 

17 The reason for focusing only on the leading jet is that we want to minimize ambiguities related to defining 
quark and gluon jets. The subleading jet is the one more likely to have undergone radiation, and with 
radiation, quark jets may change into gluon jets, and vice-versa. Additionally, the local emission environment 
is changed (e.g. non-global logs may become more important). The probability that an event has radiation 
in the vicinity of the subleading jet is $O(\alpha_s)$, while it is $O(\alpha_s^2)$ near the leading jet. As a cross-check on 
the flavor composition of our events, we have clustered the parton-level showered events with the flavor-$k_t$ 
algorithm [76]. We find that the flavor of the leading jet is consistent with expectations except in a small 
fraction of events, between a few percent and ten percent depending on the generator. 

Figure 3: Left: Distribution of $C_1^{(0.2)}$ for quark jets (purple) and gluon jets (orange) using 
Pythia dijet samples. The sample consists of anti-$k_t$ jets with radius R = 0.6 and transverse 
momentum in the range [400, 500] GeV. Right: Quark versus gluon discrimination curves 
using $C_1^{(\beta)}$ for several values of $\beta$ in Pythia. Also plotted is the leading log approximation 
for the discrimination curve, Eq. (3.8). 

Figure 4: Gluon rejection rates at 50% quark efficiency in Pythia, as a function of $\beta$. 
Left: fixing the $p_T$ range to be [400, 500] GeV and sweeping the value of $R_0$. Right: fixing 
$R_0 = 0.6$ and sweeping the $p_T$ range. For all of these cases, small values of $\beta$ yield the best 
discrimination. 

Figure 5: Left: Quark/gluon discrimination curves using jet angularities $\tau_1^{(\beta)}$ (i.e. 
1-subjettiness measured with respect to the jet axis), for several values of $\beta$ in Pythia. Also 
plotted is the leading log approximation for the discrimination curve from Eq. (3.8) and the 
discrimination curve for $C_1^{(0.2)}$. The jet sample is the same as used in Fig. 3b. Right: Gluon 
rejection rate for 50% quark efficiency as a function of $\beta$, for angularities, 1-subjettiness 
measured with respect to the broadening axis, and $C_1^{(\beta)}$. The broadening axis is defined as the 
axis which minimizes the $\beta = 1$ measure in N-subjettiness. The latter two observables are 
recoil-free, and therefore give better discrimination power for small values of $\beta$. 

In Fig. 3a, we plot the distribution of $C_1^{(0.2)}$ for jets initiated by quarks and gluons with 
transverse momentum in the range [400, 500] GeV and jet radius R = 0.6 in Pythia. As 
expected, the gluon curve lies at larger values than the quark curve because of the greater 
Sudakov suppression in gluon jets. The quark/gluon discrimination curves for different values 
of $\beta$ are shown in Fig. 3b, which are directly comparable to the NLL results in Fig. 2, up 
to jet contamination effects included in Pythia such as underlying event and initial-state 
radiation. Again, we see that $\beta \simeq 0.2$ is the optimal value. In Fig. 4, we show the gluon 
rejection rate for 50% quark efficiency as a function of $\beta$, comparing different $p_T$ ranges and 
$R_0$ values, all of which favor small values of $\beta$. Note that the gluon rejection power degrades 
as the jet radius is increased, exhibited in Fig. 4a. This may be associated with the increase 
in the amount of underlying event and initial-state radiation captured in the jet as the jet 
radius increases. This radiation is uncorrelated with the dynamics of the quark or gluon 
which initiates the jet. The degradation is most prominent at large values of $\beta$, where wide 
angles in the jet are emphasized (which is where most of the uncorrelated radiation resides). 
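
The "gluon rejection at 50% quark efficiency" figure of merit used in these plots can be estimated from two lists of observable values with a sliding cut. The sketch below is a generic illustration in plain Python, not the analysis code behind the figures; the function name and the toy input values are hypothetical.

```python
import math


def gluon_rejection_at_quark_efficiency(quark_values, gluon_values, quark_eff=0.5):
    """Sliding-cut estimate: keep jets whose observable is <= cut.

    The cut is placed at the smallest value that keeps at least the
    requested fraction of the quark sample.
    """
    ordered = sorted(quark_values)
    n_keep = max(1, math.ceil(quark_eff * len(ordered)))
    cut = ordered[n_keep - 1]
    kept_gluons = sum(1 for value in gluon_values if value <= cut)
    return 1.0 - kept_gluons / len(gluon_values)


# Hypothetical toy values standing in for per-jet observable measurements.
toy_quarks = [0.01, 0.02, 0.03, 0.04]
toy_gluons = [0.02, 0.05, 0.06, 0.08]
print(gluon_rejection_at_quark_efficiency(toy_quarks, toy_gluons))
```

Scanning `quark_eff` from 0 to 1 traces out the full discrimination curve for a given sample and value of $\beta$.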

Figure 6: Left: Quark versus gluon discrimination curves using $C_1^{(\beta)}$ for several values of $\beta$ in 
Herwig++ (directly comparable to Fig. 3b). Also plotted is the leading log approximation 
for the discrimination curve, Eq. (3.8). Right: Gluon rejection rate for 50% quark efficiency 
as a function of $\beta$, for angularities, 1-subjettiness measured with respect to the broadening 
axis, and $C_1^{(\beta)}$ in Herwig++ (directly comparable to Fig. 5b). We also tested Pythia 6.425 
and Herwig 6.520, whose results lie in between Pythia 8 and Herwig++. 

To compare the discrimination power of $C_1^{(\beta)}$ to other IRC safe observables, we consider 
1-subjettiness $\tau_1^{(\beta)}$ defined in Eq. (2.16). We allow for two different axis choices: the jet axis 
and the broadening axis (i.e. the axis that minimizes the $\beta = 1$ measure). When measured 
with respect to the jet axis, $\tau_1^{(\beta)}$ is essentially the same as the jet angularities $\tau_a$ with $a = 2 - \beta$. 
Angularities coincide with familiar observables for particular values of $\beta$: $\beta = 2$ is jet thrust 
and $\beta = 1$ is jet broadening or girth. Among the angularities, Ref. [77] found that jet 
broadening ($\beta = 1$) was the most powerful angularity for quark/gluon discrimination, and so 
is a natural benchmark to compare to $C_1^{(\beta)}$. When measured with respect to the broadening 
axis, $\tau_1^{(\beta)}$ is a recoil-free observable and is therefore expected to behave similarly to $C_1^{(\beta)}$. 

In Fig. 5a we plot the discrimination curves for angularities (i.e. 1-subjettiness measured 
with respect to the jet axis) for several values of $\beta$, as well as the discrimination curve for 
$C_1^{(0.2)}$ in Pythia. Indeed, for most of the range, the most discriminating angularity is $\beta = 1$, 
but the performance of all angularities is roughly comparable to and only somewhat better 
than the LL expectation. By contrast, $C_1^{(0.2)}$ yields a quark to gluon efficiency ratio that is 
about twice as large as any of the angularities over much of the range. In Fig. 5b, we highlight 
the importance of working with recoil-free variables, by plotting the gluon rejection rate at a 
fixed 50% quark efficiency. For $\beta > 1$, the energy correlation double ratio and 1-subjettiness 
have essentially the same performance. As $\beta$ approaches 0, however, the discrimination power 
for the angularities degrades, while the two recoil-free observables ($C_1^{(\beta)}$ and 1-subjettiness 
with respect to the broadening axis) have improved performance, as expected from the NLL 
analysis.$^{18}$ 

18 The reason for the mismatch between $C_1^{(\beta)}$ and $\tau_1^{(\beta)}$ with respect to the broadening axis at very small values 
of $\beta$ has not yet been determined. 

To verify the claims made about the performance of $C_1^{(\beta)}$ as a quark/gluon discriminator, 
we also simulate quark and gluon dijet samples in Herwig++ 2.6.3 [89, 90]. We use the same 
kinematic cuts and jet algorithm parameters as in the Pythia samples. As the same 
qualitative conclusions hold in the Herwig++ samples as in Pythia, we only reproduce Figs. 3b 
and 5b for the Herwig++ sample. In Fig. 6a, we plot the quark versus gluon discrimination 
curve with $C_1^{(\beta)}$. While the discrimination power of $C_1^{(\beta)}$ in the Herwig++ sample is not as 
great as in the Pythia sample, the behavior that the discrimination increases as $\beta$ decreases 
is robust. In Fig. 6b, we plot the gluon rejection rate at a fixed 50% quark efficiency for the 
three observables considered earlier. As in Fig. 5b, the discrimination power of the 
recoil-free observables increases as $\beta$ decreases and degrades for the recoil-sensitive angularities 
(though the $\beta$-dependence is once again weaker than with Pythia 8). We also tested $C_1^{(\beta)}$ 
using Pythia 6.425 [86] with tunes DW and Perugia 2011 [91] and Herwig 6.520 [92] with 
JIMMY [93], which exhibit discrimination power that is intermediate between Pythia 8 and 
Herwig++. We also checked that the behavior is robust as underlying event, initial-state 
radiation, and hadronization are sequentially turned off. 

Of course, there is a substantial numerical difference between Pythia 8 and Herwig++ 
for quark versus gluon discrimination. Some of the distinction between Pythia 8 and Herwig++ 
could be due to the fact that different evolution variables are used: Pythia 8 is a $p_T$-ordered 
shower whereas Herwig++ is angular ordered. This could in turn affect the flavor content 
of the quark and gluon jets, thus leading to variations in the ability to discriminate quark 
and gluon jets. The energy correlation function observables seem to be particularly sensitive 
to these differences, especially at relatively small values of the angular exponent $\beta$. This 
suggests that any detailed study of the properties of quark and gluon jets should measure 
$C_1^{(\beta)}$ at multiple values of $\beta$. Beyond discriminating quark and gluon jets, measurements of 
energy correlation functions at both $e^+e^-$ and hadron colliders could be useful for Monte 
Carlo tuning, especially given the current differences between generators. 

In these studies, $C_1^{(\beta)}$ has been measured on jet samples which include both charged and 
neutral hadrons (and we have not applied realistic smearing of energies and angles). In 
order to exploit the discrimination power of $C_1^{(\beta)}$ with $\beta \simeq 0.2$, one needs excellent angular 
resolution, so in an experimental context, it may be advantageous to measure $C_1^{(\beta)}$ using only 
charged hadrons. A track-only calculation of $C_1^{(\beta)}$, using e.g. the methods of Ref. [94], is beyond 
the scope of this work, but we did verify in Monte Carlo that the quark/gluon discrimination 
power only degrades by a few percent when using a track-only version of $C_1^{(\beta)}$. 

4 Boosted Electroweak Bosons with $C_2^{(\beta)}$ 

Our next case study is using $C_2^{(\beta)}$ to discriminate between QCD jets and jets with two intrinsic 
subjets, such as boosted W, Z, or Higgs bosons.$^{19}$ Recall that $C_2^{(\beta)}$ involves the 3-point 
correlator. To identify a boosted resonance, one first looks for jets whose mass is compatible 
with the resonance of interest. Then $C_2^{(\beta)}$ can be used to determine if the jet has two hard 
subjets, in which case the jet is tagged as coming from that boosted resonance. While we have 
not carried out analytic calculations to guide our understanding of the performance of $C_2^{(\beta)}$ (and 
higher point correlation functions), it is still instructive to study its discrimination power in 
Monte Carlo. We use Pythia 8 to demonstrate the qualitative behavior and performance 
of $C_2^{(\beta)}$, though one must of course be mindful of the quantitative differences in Monte Carlo 
programs seen already in Sec. 3.3. A calculation of $C_2^{(\beta)}$ will be left to future work. 

19 For related studies, see Refs. [3-5, 9, 49, 95-108]. 

The key finding from this section is that this tagging procedure is very sensitive to the 
ratio of the jet mass to the jet transverse momentum. This arises because the structure 
of the QCD background depends strongly on the jet mass requirement, and the behavior of 
$C_2^{(\beta)}$ differs depending on what type of radiation contributes dominantly to the jet mass. For 
a fixed jet $p_T$, we will find that small values of $\beta \simeq 0.5$ are better for high mass resonance 
discrimination, whereas large values of $\beta \simeq 2$ give the optimal separation at lower masses. 
In both regimes, $C_2^{(\beta)}$ offers better discrimination power than $\tau_{2,1}$, with the difference being 
more pronounced for small $m/p_T$. After describing this physics for a generic boosted 
2-prong resonance, we will specialize to the case of the Higgs boson, where additional b-tagging 
information is available. 

4.1 Dependence on the Mass Criterion 

Consider a quark or gluon jet with invariant mass close to the boosted resonance of interest 
(which we will call a Z boson for concreteness). For jets with mass comparable to their 
transverse momentum, the mass is dominated by a single, relatively hard, perturbative splitting. 
Thus, one expects that the QCD jets that can fake a Z boson are those with two relatively 
hard cores of energy surrounded by soft radiation. These jets are straightforward to analyze 
in fixed-order perturbation theory (to generate the jet mass) matched to resummed 
perturbation theory (to generate the radiation pattern for $C_2^{(\beta)}$), since there is a clear ordering of 
emissions in the jet. In particular, QCD jets with large mass should appear similar to jets 
initiated from heavy resonance decay, with differences controlled mainly by the color of the 
decay products and the phase space of the hard splitting. 

For many systems of interest, however, the above analysis is not appropriate. Once the jet 
mass is less than around a fifth of the jet transverse momentum times R, the mass no longer 
arises dominantly from a hard perturbative splitting. For jets in the low to intermediate mass 
ranges, a significant mass can be generated by a single soft emission from a single hard core. 
At lower masses, the mass of a jet is generated by multiple soft emissions. Jets in the low and 
intermediate mass regimes require resummation of these soft emissions to accurately model 
their dynamics as fixed-order perturbation theory is no longer accurate. For this reason, we 
expect QCD jets in this mass regime to look very different from boosted resonances with two 
hard cores, and the discrimination power of $C_2^{(\beta)}$ should improve as the ratio of the jet mass to 
the transverse momentum decreases. In addition, as discussed in Sec. 2.2.2, $C_2^{(\beta)}$ is better able 
to exploit the color singlet nature of the Z boson when $m/p_T$ is small. 
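
The rough boundary between the two regimes can be expressed as a one-line check. The sketch below labels a jet by comparing its mass with one fifth of $p_T R$, the approximate threshold quoted above; the threshold is only indicative, and the function name and labels are illustrative.

```python
def mass_regime(mass, pt, r0=1.0, hard_threshold=0.2):
    """Label the dominant origin of the jet mass using the rough m / (pT * R) ~ 1/5 rule.

    mass and pt are in the same units (e.g. GeV); hard_threshold is the
    approximate boundary quoted in the text, not a sharp physical cutoff.
    """
    ratio = mass / (pt * r0)
    return "hard splitting" if ratio >= hard_threshold else "soft emissions"


# Resonance masses of interest at a representative jet pT of 450 GeV, R0 = 1.0.
for m in (80.0, 91.0, 125.0, 200.0):
    print(f"m = {m:.0f} GeV: m/pT = {m / 450.0:.2f}, {mass_regime(m, 450.0)}")
```

For jets with $p_T$ around 450 GeV, masses near the W and Z sit close to the boundary, while 200 GeV lies well inside the hard-splitting regime.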

Figure 7: Distribution of mass of QCD jets in the process $pp \to Zj$ as simulated in 
MadGraph5 1.5.0 and showered in Pythia 8.165. The transverse momentum of the jets lies in 
one of the ranges of [200, 300], [400, 500] or [600, 700] GeV, as labeled on the plot. 

Figure 8: Distributions of $C_2^{(2)}$ for QCD jets (left) and Z bosons decaying to jets (right) 
with different masses of the Z. The transverse momentum of the jets for all masses lies in the 
range of [400, 500] GeV. The different curves correspond to different event samples according 
to the mass of the resonance. 

To illustrate this, we generate a mixed sample of quark and gluon jets from $pp \to Zj$ with 
the Z decaying to leptons. These are simulated at the 8 TeV LHC in MadGraph5 1.5.0 
[109], showered in Pythia 8.165 [86, 87], and we identify the hardest anti-$k_t$ $R_0 = 1.0$ 
jet. In Fig. 7 we plot the invariant mass spectrum of QCD jets in three different transverse 
momentum bins, $p_T \in [200, 300]$ GeV, [400, 500] GeV, and [600, 700] GeV. We see that the 
mass distributions in each $p_T$ bin have steeply falling tails extending to masses of about 
$p_T/2$. In the tail region, we expect fixed-order perturbation theory to accurately describe the 
origin of mass of the jet. At lower masses, below about $p_T/5$, Sudakov suppression becomes 
important as the distributions peak and decrease toward zero mass. In this mass regime, 
fixed-order perturbation theory is no longer adequate to describe the distribution. 

This differing origin of the jet mass is reflected in the C2^(β) distributions. Because small
values of C2^(β) correspond to 2-subjet-like jets, the C2^(β) distribution moves to lower
values as the mass of a QCD jet increases, as shown in Fig. 8a for β = 2 in the pT range
[400, 500] GeV. 20 In contrast, for a boosted heavy particle that decays to two partons, the
C2^(β) distribution is relatively insensitive to the resonance mass, since the mass of such a
jet comes mostly from the two partons from the decay regardless of the boost factor. Shown in
Fig. 8b is the signal C2^(2) distribution for pp → ZZ, where one of the Z bosons decays to
leptons and the other decays to jets. We can manually adjust the mass of the Z in MadGraph5
to study several different mass to transverse momentum ratios. For mZ = {91, 125, 200} GeV,
the C2^(2) distributions are remarkably similar. 21

Figure 9: Left: the discrimination curves for boosted hadronic Z bosons (mZ = 91 GeV)
compared to QCD jets with C2^(β) for various values of β. The transverse momentum of
all jets was required to lie in the range [400, 500] GeV. Right: QCD rejection rate for
50% boosted Z efficiency as a function of β, sweeping the value of the Z boson mass over
mZ = {80, 91, 110, 125, 150, 200} GeV. The optimal value of β depends strongly on the
resonance mass.

In Fig. 9a, we show the QCD jet versus Z boson discrimination curve for mZ = 91 GeV
with pT ∈ [400, 500] GeV for several values of β. To see how the physics changes as the
resonance mass changes, we plot the QCD rejection rate for 50% boosted Z efficiency in
Fig. 9b as a function of β, for mZ = {80, 91, 110, 125, 150, 200} GeV. At low masses, the most
powerful discriminant is β ≃ 1.5 to 2. This is expected, since large values of β emphasize soft
wide-angle emissions where there is more of a penalty for QCD jets in the Sudakov peak.
However, we do not have a quantitative way to understand why the discrimination power
saturates at β ≃ 2, as opposed to even higher values. At intermediate masses, a wide range
of β values yield very similar results. At the high masses where QCD jets are in the tail
region, the discrimination dependence on β inverts, with the most powerful discrimination
for β ≃ 0.5. This likely has to do with the same quark/gluon color factor discrimination as in
Sec. 3. In particular, high mass QCD jets are formed by a hard perturbative splitting, which
is most likely to be a gluon, whereas the Z jet has two hard quark subjets. That said, we
have not yet performed an NLL calculation to understand why β ≃ 0.5 is preferred, as opposed
to even smaller values.

20 The labelled jet masses of mZ = {80, 91, 110, 125, 150, 200} GeV correspond to the jet mass ranges
[70, 90] GeV, [80, 100] GeV, [100, 120] GeV, [110, 140] GeV, [140, 170] GeV, and [180, 230] GeV.

21 C2 is not invariant to transverse boosts, so for more extreme values of m/pT, the distribution will move to
smaller values. However, because of underlying event and initial state radiation, C2 does not change as much
as one would naively expect under boosts.

Figure 10: Left: the discrimination curves for boosted hadronic Z bosons (mZ = 91 GeV)
compared to QCD jets with τ2,1^(β) for various values of β. For comparison is shown the C2^(β)
curve with the best discrimination (β = 1.7). The subjet axes for N-subjettiness are defined
as those that minimize the β = 1 measure (broadening axes). Right: QCD rejection rate
for 90% boosted Z efficiency as a function of β, sweeping the value of the Z boson mass
over mZ = {91, 125, 200} GeV. Because these curves are with 90% Z efficiency, they are not
directly comparable to Fig. 9b. Note that as m/pT decreases, the performance of C2 improves
relative to τ2,1.

Finally, it is instructive to compare the discrimination power of C2 to N-subjettiness.
The ratio of 2-subjettiness to 1-subjettiness, τ2,1^(β), is defined in Eq. (2.17) and can be used
to identify Z bosons decaying to two jets. To eliminate ambiguities in minimum axes finding
at small values of β, we choose to define the subjet axes as those that minimize the β = 1
measure (i.e. the broadening axes). The discrimination curves of τ2,1^(β) for mZ = 91 GeV are
plotted in Fig. 10a, with the C2^(β) curve with the most discriminating value of β from Fig. 9a
shown for comparison. We also show the QCD rejection rate for 90% boosted Z efficiency in
Fig. 10b. At low masses, C2^(β) performs as well as or better than τ2,1^(β) over the entire range
of β, except at very small values of β. At high masses, the discrimination power of τ2,1^(β)
becomes comparable to C2^(β), since both observables lock onto the hard subjets in the Z decay
or the massive QCD jet. The increase in the relative discrimination power of C2 with respect
to τ2,1 as the ratio m/pT decreases is expected from the discussion of Sec. 2.2.2. As m/pT
decreases, soft wide-angle subjets become more important for determining the structure of the
jet and C2 emphasizes these emissions more than τ2,1.
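
The figure of merit quoted above, the QCD rejection rate at a fixed signal efficiency, can be sketched in a few lines. This is only an illustration under stated assumptions: the rejection rate is taken to be the inverse of the background efficiency, the cut keeps jets with small values of the observable (as appropriate for C2, where small values are 2-prong-like), and the function name `rejection_at_efficiency` is hypothetical rather than part of any tool used in this paper.

```python
import math


def rejection_at_efficiency(signal, background, efficiency):
    """Return (cut, rejection) for a sliding cut `value <= cut`.

    The cut is the smallest signal value that keeps at least the requested
    fraction of signal jets. Rejection is assumed here to mean
    1 / (background efficiency); this convention is an assumption.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    ordered = sorted(signal)
    index = max(math.ceil(efficiency * len(ordered)) - 1, 0)
    cut = ordered[index]
    passed = sum(1 for value in background if value <= cut)
    if passed == 0:
        return cut, math.inf  # no background jet survives the cut
    return cut, len(background) / passed


if __name__ == "__main__":
    # Toy observable values, not simulation output.
    toy_signal = [0.1, 0.2, 0.3, 0.4]
    toy_background = [0.15, 0.5, 0.6, 0.7, 0.8, 0.9, 0.25, 1.0]
    print(rejection_at_efficiency(toy_signal, toy_background, 0.5))
```

Scanning this function over β at fixed efficiency (50% or 90%) would give curves of the kind described for Figs. 9b and 10b, given per-jet values of the observable for each β.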

4.2 Boosted Higgs Identification 

One key application for 2-prong jet substructure observables is for identifying boosted Higgs
bosons in the decay H → bb̄. Compared to the case of Z bosons, there is additional information
from the presence of b quarks (and the resulting B hadrons) in the final state, which can
be used to mitigate QCD backgrounds. Thus, to identify boosted Higgs bosons decaying to
bottom quarks, we employ three criteria. First, we require the jet to have a mass comparable
to the Higgs boson. Second, we demand that two B hadrons are tagged in the jet. Third, we
use a sliding cut on C2^(β) to test for two hard subjets in the jet.

Because we demand that the jet have two B hadrons, the largest QCD background to 
Higgs decays to bottoms is gluon splitting to bottoms. The splitting function g → bb̄ does
not have a soft singularity, so the bottoms from this splitting will have comparable energies. 
This is also the case for Higgs decay, so we do not expect the same discrimination power for 
Higgs bosons compared to Z bosons studied above. That said, because of the difference in 
the total color of the jets, there is an additional handle on Higgs versus gluon discrimination; 
the bottom quarks from the gluon splitting will be in a color octet state, so there will be 
significantly more radiation at wide angles compared to Higgs jets. 

This color octet versus color singlet distinction can be exploited in two ways. First, more
wide-angle radiation can be included in the jet if the jet radius is increased. Larger jet radii
improve the contrast for C2^(β), since more wide-angle radiation is included in the (octet) gluon
jets compared to the (singlet) Higgs jets. Second, the value of β can be set to accentuate
the importance of wide-angle emissions in the jet. As β increases, more weight is given to
wide-angle emissions, further penalizing gluon jets compared to Higgs jets when using C2^(β).

A full study of boosted Higgs identification using C2^(β) is beyond the scope of this work,
but we can get a sense for the discrimination power of C2^(β) by comparing the boosted Higgs
signal pp → ZH to the leading QCD background of pp → Zbb̄ where both bottom quarks
happen to be clustered into the same jet. We generate both the signal and background
distributions for the 8 TeV LHC in MadGraph5 1.5.0 [109] plus Pythia 8.165 [86, 87], with
all ground state B hadrons stable to allow for naive b-tagging of the jets (with 100% efficiency
and no mistags). The mass of the Higgs is set to 125 GeV, the Z is decayed to leptons,
and the Higgs is decayed to bb̄. We consider anti-kT jets with various values of the jet radius
R0 = {0.6, 0.8, 1.0, 1.2}. The leading jet is required to have transverse momentum in the
range [400, 500] GeV with exactly two B hadrons as constituents. To approximate realistic
b-tagging within jets, we recluster the jet with the kT algorithm to find two exclusive subjets,
and we require that each subjet contain exactly one identified B hadron. Finally, the mass
of the jet is required to be in the window mJ ∈ [110, 140] GeV (i.e. within 15 GeV of the
Higgs mass). From the leading jet, we compute C2^(β) for various values of β and determine
the discrimination power.

Figure 11: Distributions of C2^(2) for bb̄ jets from QCD (left) and Higgs bosons decaying to bb̄
(right) with different jet radii. The plotted events are in the mass window mJ ∈ [110, 140] GeV
and the transverse momentum window pT ∈ [400, 500] GeV.

Figure 12: Left: Discrimination curves of H → bb̄ jets versus bb̄ jets from QCD with
C2^(β) for several values of β with jet radius R0 = 1.0. Right: QCD bb̄ rejection rate for
50% boosted H → bb̄ efficiency as a function of β, sweeping the value of the jet radius
R0 = {0.6, 0.8, 1.0, 1.2}.

In Fig. 11, we plot the distributions of C2^(2) for Higgs jets and the QCD background for
various jet radii. As expected, the C2^(2) distributions increase dramatically as the jet radius
increases for QCD jets, while they only increase slightly for Higgs jets. In Fig. 12a, we plot
the discrimination curves of Higgs jets versus QCD using C2^(β) for several values of β for the
jet radius R0 = 1.0. As expected, the discrimination power increases as the angular
exponent increases, again a consequence of the greater amount of wide-angle radiation in the
QCD jets. Fig. 12b shows the β dependence of the QCD rejection rate for 50% boosted Higgs
efficiency for jet radii of R0 = {0.6, 0.8, 1.0, 1.2}. The rejection rate increases dramatically as
the jet radius increases. At small jet radii, large values of β lead to the best discrimination,
as large β emphasizes wide-angle emissions which differ for QCD and boosted Higgs jets. As
the jet radius increases, the largest QCD rejection rate moves to intermediate values of β.
This may be because a large jet radius will tend to include more initial state radiation or
underlying event, which is independent of the dynamics of the jet.

Figure 13: Left: the discrimination curves for boosted H → bb̄ compared to QCD bb̄ jets
with τ2,1^(β) for various values of β. For comparison is shown the C2^(β) curve with the best
discrimination (β = 2.0). Right: QCD bb̄ rejection rate for 50% boosted H → bb̄ efficiency as
a function of β, sweeping the value of the jet radius R0 = {0.6, 0.8, 1.0, 1.2}.

In Fig. 13a, we compare the discrimination performance of the N-subjettiness ratio τ2,1^(β)
to the most discriminating C2^(β) (β = 2.0) with jet radius equal to R0 = 1.0. Over the entire
range of signal efficiencies, C2^(β) performs better than τ2,1^(β) for any value of β. In Fig. 13b,
we plot the QCD rejection rate for 50% boosted Higgs efficiency at several jet radii. For a jet
with a small jet radius, C2^(β) performs significantly better than any N-subjettiness, with the
distinction decreasing as the jet radius increases.

5 Boosted Top Quarks with C3

Our final case study tests the discrimination power of even higher-point correlation functions,
namely using C3 to distinguish boosted top quarks from QCD jets. 22 Unlike the previous
two case studies, this observable is significantly more challenging than lower-point correlation
functions. From a computational point of view, C3 involves a 4-point correlator, so it is
computationally expensive: its cost scales like k^4, where k is the number of particles in
the system. That said, our FastJet add-on only requires a few milliseconds to analyze a
boosted top event at one value of β. From an analytical point of view, each term in ECF(4, β)
involves a product of 4 energies and (4 choose 2) = 6 angles, complicating an understanding of
how C3 behaves in various limits.

22 For related studies, see Refs. [3, 6-9, 42, 43, 49, 50, 55, 110-117].

We will find that C3 performs significantly worse than one might expect from the strong
performance seen in C1 and C2. While it is possible that this is an artifact of choosing the
particular double ratio combination in the definition of C3, we suspect that the proliferation
of energy and angular factors in ECF(4, β) is reducing the sensitivity of C3 to any individual
soft emission. In particular, for a soft-collinear emission, C1 and C2 are independent of the
kinematics of the hard structure of the jet. By contrast, even for a soft-collinear emission,
C3 retains dependence on the hard kinematics of the jet. This is because the correlation
functions in the ratio defining C3 are dominated by possibly different subsets of the hard
emissions in the jet. Nevertheless, it is illustrative to see that even with these limitations,
there is still discrimination power in C3.
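
To make the counting of energy and angular factors concrete, the following is a minimal brute-force sketch of the N-point correlator and the double ratio. It assumes the definitions from earlier in the paper (not restated in this section): ECF(N, β) is the sum over N-particle subsets of the product of transverse momenta times the product of all pairwise rapidity-azimuth distances raised to the power β, and C_N = ECF(N+1, β) ECF(N−1, β) / ECF(N, β)^2. The particle format `(pt, y, phi)` and the function names are illustrative choices; this is not the FastJet add-on mentioned above.

```python
from itertools import combinations
from math import hypot, pi, prod


def delta_r(a, b):
    """Rapidity-azimuth distance between particles given as (pt, y, phi)."""
    dphi = (a[2] - b[2] + pi) % (2 * pi) - pi  # wrap into [-pi, pi)
    return hypot(a[1] - b[1], dphi)


def ecf(particles, n, beta):
    """Brute-force ECF(n, beta); the number of terms grows like k**n."""
    if n == 0:
        return 1.0
    total = 0.0
    for subset in combinations(particles, n):
        pts = prod(p[0] for p in subset)  # n transverse momenta
        # n * (n - 1) / 2 pairwise angles, e.g. 6 angles for n = 4
        angles = prod(delta_r(a, b) for a, b in combinations(subset, 2))
        total += pts * angles ** beta
    return total


def c_ratio(particles, n, beta):
    """Double ratio C_n = ECF(n+1) * ECF(n-1) / ECF(n)**2."""
    return (ecf(particles, n + 1, beta) * ecf(particles, n - 1, beta)
            / ecf(particles, n, beta) ** 2)


if __name__ == "__main__":
    toy_jet = [(3.0, 0.0, 0.0), (1.0, 0.5, 0.0)]  # toy two-particle jet
    print(c_ratio(toy_jet, 1, 1.0))  # equals z (1 - z) theta**beta with z = 1/4
```

For a two-particle jet, `c_ratio(jet, 1, beta)` reduces to z(1 − z)θ^β, and `c_ratio(jet, 2, beta)` vanishes because no three-particle subset exists.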

To study the performance of C3 as a top tagger, we use the boosted top and QCD background
event samples created for the BOOST 2010 workshop [1]. 23 These events come from 7
TeV LHC collisions simulated with Herwig 6.510 [118] with underlying event simulated with
JIMMY [93] with an ATLAS tune [119]. The event samples consist of 2 → 2 QCD processes,
either all-hadronic tt̄ production or dijet production. For direct comparison to other top
tagging procedures, we follow the analysis procedures used in Ref. [1]. We identify anti-kT
jets with radius R0 = 1.0 and demand that the jets have pT ∈ [500, 600] GeV. No detector
simulation is performed at this stage, other than removing muons and neutrinos before jet
finding. We impose three cuts to discriminate top jets from QCD. First, we demand that the
jets have mass in the fixed window of 160 < mJ < 240 GeV, and second, we apply a sliding
cut on C3. In addition, it was noted in Ref. [14] that ratio observables such as C3 can be
IR-unsafe without an additional cut. We therefore apply a third cut that C2^(β) > 0.1, which
makes C3 explicitly IR-safe.

In scanning over the range of 0.5 < β < 2.5, we found that the best discrimination over
a wide range of signal efficiencies using C3 is obtained for β = 1.0. This is the same β
value that is optimal for N-subjettiness τ3,2 [50]. A plot of the distribution of C3^(β=1) for top
jets and QCD jets in the kinematic and mass windows from above is shown in Fig. 14a. The
discrimination curves for the different values of β are shown in Fig. 14b, where the quoted
efficiencies only include the effect of a cut on the observable C3 for jets in the mass window
of 160 to 240 GeV.

23 The events can be found at http://www.lpthe.jussieu.fr/~salam/projects/boost2010-events/ and
http://tev4.phys.washington.edu/TeraScale. While updated event samples are available from the BOOST
2011 report [2], the comparison includes a W subjet tagger which would artificially improve the performance
of C3.
Figure 14: Left: Distribution of C3^(β=1) comparing top jets and QCD jets. The plotted events
are in the mass window mJ ∈ [160, 240] GeV and the transverse momentum window pT ∈
[500, 600] GeV. Right: Discrimination curves for top jets versus QCD jets, using C3^(β) for
several values of β. These efficiencies only include the effect of the cut on C3.

Figure 15: Comparing the performance of C3 to other methods studied in the BOOST 2010
report [1]. The efficiency curves for N-subjettiness (τ3/τ2) [50] and the angular correlation
function (ACF) [55] were added later. Here, the efficiencies include both the effect of a mass
cut as well as a cut on C3.

Finally, we compare the performance of C3 against several other top tagging procedures
in Fig. 15. For this plot, the quoted efficiencies include both the effect of the mass cut as
well as the effect from a cut on C3. While not as powerful as methods like N-subjettiness,
the energy correlation function yields comparable discrimination power to other methods. Of
course, the performance may be improved by combining information from different values of
β, as well as including additional C2 and C1 information.
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6 Conclusions 



In this paper, we have introduced arbitrary-point energy correlators that are sensitive to
N-prong substructure. These correlators are effective when used as part of an energy correlation
double ratio C_N^(β), though more general combinations deserve further exploration. Through
an NLL calculation, we have seen how C1 yields excellent quark/gluon discrimination, with
β ≃ 0.2 being most effective at capturing the differences in color charges. We have also shown
the power of C2 for boosted 2-prong objects like Higgs bosons, and the potential power of C3
for boosted 3-prong objects like top quarks.

Given the explosion of jet substructure methods over the past few years, it is worth
asking whether C_N is sufficiently novel to merit further experimental and theoretical studies.
Indeed, it is a unique variable that combines a number of desirable features. Like
N-subjettiness, C_N is a variable which tests for N-prong substructure, but it can behave more
continuously in situations with soft subjets. Like planar flow and related jet shapes, C_N can
be calculated directly from the energies and angles of the jet constituents without a separate
axes finding step, but it is designed for identifying N-prong substructure instead of just
exotic kinematic configurations. Finally, like jet angularities, C_N is sensitive to higher-order
radiation about LO substructure, but because it is a recoil-free observable, it can better probe
the collinear physics that distinguishes a jet's color with 0.2 < β < 1.0. Because C_N has a
high computational cost for N > 3, we expect C_N will be most useful in practice for 1-, 2-,
and 3-prong jet studies.

To gain further confidence in the behavior and performance of these observables, further
analytic studies are needed. A calculation of C2 for QCD backgrounds is particularly needed.
We already saw that the behavior of C2 for QCD jets depends strongly on the ratio of the
jet's mass to its pT, and it is likely that different theoretical descriptions will be needed for
C2 as a function of m/pT. While C2 is built as a ratio of IRC safe observables, C2 itself is only
IRC safe with a cut on the jet mass (which acts like a cut on the denominator), and it is
an interesting question how to best perform NLL resummation for generic ratio observables.
Like all jet shape observables, C2 is sensitive to underlying event, initial state radiation, and
pileup, which must be accounted for in determining the optimal β value. Ideally, theoretical
progress on understanding C2 and other jet shapes will match the rapid experimental progress
in implementing them, such that jet substructure observables can truly be a robust tool for
LHC physics.
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A Fixed-Order Calculation 

In this appendix, we present the details of the fixed-order calculation of C1 and matching
to the NLL result from Sec. 3.2. The calculation is valid for any jet algorithm that, for
configurations involving exactly two partons in some neighborhood, combines two partons
into a single jet if they are separated by an angle less than R0 and otherwise places them in
separate jets. At this order we define a quark jet to be a jet that contains a single quark,
or a quark and a gluon. A gluon jet is a jet that contains a single gluon, a gluon pair, or a
quark-antiquark pair (of identical flavor). 24

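In the limit where the jet radius R0 is small, the O(αs) cumulative distribution of the
observable can be computed from

\[
\begin{aligned}
\Sigma(C_1) &= 1 + \frac{\alpha_s}{\pi} \int_0^{R_0} \frac{d\theta}{\theta} \int_0^1 dz\, P(z)
  \left[ \Theta\!\left( C_1 - z(1-z)\,\theta^\beta \right) - 1 \right] \\
&= 1 - \frac{\alpha_s}{\pi} \frac{1}{\beta} \int_{\frac{1-u}{2}}^{\frac{1+u}{2}} dz\, P(z)
  \ln \frac{z(1-z) R_0^\beta}{C_1} \,,
\end{aligned}
\tag{A.1}
\]

where

\[
u = \sqrt{1 - \frac{4 C_1}{R_0^\beta}} \,, \tag{A.2}
\]

and we have approximated the full matrix element by the appropriate splitting function, P(z),
as is legitimate for R0 ≪ 1. The splitting functions are

\[
P_q(z) = C_F\, \frac{1 + z^2}{1 - z} \,, \tag{A.3}
\]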



24 Algorithms that satisfy the condition for when they pair partons into a single jet include all members of
the generalized-kT family, notably the anti-kT algorithm [85]. One subtlety is that the flavor of jets from such
algorithms is not infrared safe for configurations with three or more particles in a common neighborhood. There
exist algorithms designed specifically to guarantee a safe jet flavor to all orders, the "flavor-kT" algorithms [76].
These, however, have the property that a quark-antiquark pair can be combined into a jet even for angular
separations larger than R0, and so they do not yield the same jets at O(αs) as the generalized-kT family and
as we assume in the calculation here. We have reason to believe that it is possible to design an algorithm that
is both equivalent to generalized-kT at O(αs) and flavor safe to all orders, but leave an investigation of this
question to future work.
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for gluons, including combinatoric factors. For quarks, it follows that 
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For gluons, the fixed-order cumulative distribution is 
\[
\ldots \; + \; \ldots \, \tanh^{-1}(u) \; \ldots \tag{A.6}
\]

Here, Li2(x) is the dilogarithm function,

\[
\mathrm{Li}_2(x) = - \int_0^x \frac{dt}{t}\, \ln(1 - t) \,. \tag{A.7}
\]

To match the fixed-order cumulative distribution to the NLL cumulative distribution, we 
use the "Log-R" matching scheme [120]. In this matching scheme, the fixed-order corrections 
are exponentiated with the NLL distribution. The logarithms that appear in the fixed- 
order expression must be properly subtracted so as to eliminate a double counting with the 
logarithms that were resummed. Also, at large values of C1, we want the distribution to agree
with the fixed-order result. This requires "turning off" the logarithms of the resummation 
properly. 

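Matching O(αs) fixed-order to NLL is straightforward. The matching scheme can be
written as

\[
\Sigma_{\rm match} = \Sigma(L)_{\rm resum}\,
  e^{-\frac{\alpha_s}{\pi} \left( R_1 - G_0 - G_1 L - G_2 L^2 \right)} \,. \tag{A.8}
\]

Here, R1 is defined from the fixed-order cumulative distribution as

\[
\Sigma = 1 - \frac{\alpha_s}{\pi} R_1 + O(\alpha_s^2) \,, \tag{A.9}
\]

and G0, G1 L and G2 L^2 are placeholders representing the constant terms, single logarithms
and double logarithms that have been resummed, respectively. For quarks, the logarithms
are

\[
\left( G_1 L + G_2 L^2 \right)_q = \frac{C_F}{\beta} \ln^2 \frac{R_0^\beta}{C_1}
  - \frac{3 C_F}{2 \beta} \ln \frac{R_0^\beta}{C_1} \,, \tag{A.10}
\]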
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and for gluons the logarithms are 

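\[
\left( G_1 L + G_2 L^2 \right)_g = \frac{C_A}{\beta} \ln^2 \frac{R_0^\beta}{C_1}
  - \frac{11 C_A - 2 n_f}{6 \beta} \ln \frac{R_0^\beta}{C_1} \,. \tag{A.11}
\]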
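
As a rough numerical cross-check of the quark logarithms, one can integrate the O(αs) expression directly. The sketch below assumes the quark coefficient R1 = (1/β) ∫ dz P_q(z) ln(z(1−z) R0^β / C1) over z between (1−u)/2 and (1+u)/2 with u = sqrt(1 − 4 C1/R0^β) and P_q(z) = C_F (1+z^2)/(1−z), and compares β R1 / C_F with L^2 − (3/2) L for L = ln(R0^β / C1). The function names, the midpoint rule, and the change of variables are illustrative choices, not the method used for the results in this paper.

```python
import math


def quark_r1_scaled(c, steps=40000):
    """beta * R_1 / C_F for quarks, with c = C_1 / R_0**beta in (0, 1/4)."""
    u = math.sqrt(1.0 - 4.0 * c)
    z_lo, z_hi = (1.0 - u) / 2.0, (1.0 + u) / 2.0
    # Substitute y = -ln(1 - z), so dz / (1 - z) = dy and the soft pole
    # of the splitting function is absorbed into the integration measure.
    y_lo, y_hi = -math.log(1.0 - z_lo), -math.log(1.0 - z_hi)
    h = (y_hi - y_lo) / steps
    total = 0.0
    for k in range(steps):  # midpoint rule in y
        z = 1.0 - math.exp(-(y_lo + (k + 0.5) * h))
        total += (1.0 + z * z) * math.log(z * (1.0 - z) / c)
    return total * h


def quark_log_terms(c):
    """Double and single logarithms L**2 - 1.5 * L with L = ln(1 / c)."""
    big_l = math.log(1.0 / c)
    return big_l * big_l - 1.5 * big_l


if __name__ == "__main__":
    for c in (1e-3, 1e-4, 1e-5):
        print(c, quark_r1_scaled(c) - quark_log_terms(c))
```

If the logarithms are identified correctly, the printed remainder should approach a constant as C1 decreases, which is the piece left over for the constant term in the matching.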

The choice of G0 in Σ(L)_resum is arbitrary because these terms are subleading to the NLL
resummation. Subtracting these logarithms from R1, in addition to the constant terms G0,
eliminates double counting. Also, to ensure that the distribution agrees with the fixed-order
result at large values of C1, we can shift the argument of the logarithms appropriately to
vanish when C1 = R0^β/4, which is the maximum value of C1. That is, we replace the logarithms
in the resummation and subtraction by

\[
L \to \tilde{L} = \ln\!\left( \frac{R_0^\beta}{C_1} - \frac{R_0^\beta}{C_1^{\max}} + 1 \right)
  = \ln\!\left( \frac{R_0^\beta}{C_1} - 3 \right) \,. \tag{A.12}
\]

L̃ vanishes when C1 = R0^β/4 and smoothly interpolates to L in the small C1 region. The final
NLL resummed cumulative distribution matched to fixed-order is

\[
\Sigma_{\rm match} = \Sigma(\tilde{L})_{\rm resum}\,
  e^{-\frac{\alpha_s}{\pi} \left( R_1 - G_1 \tilde{L} - G_2 \tilde{L}^2 \right)} \,. \tag{A.13}
\]

We use this expression to determine the quark versus gluon discrimination in Sec. 3.2.
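
The behaviour of the shifted logarithm can be checked numerically. The sketch below assumes L = ln(R0^β / C1) and the shifted form L̃ = ln(R0^β / C1 − 3) read off from Eq. (A.12), with the maximum value C1 = R0^β / 4; the function names and the sample values of R0 and β are illustrative only.

```python
import math


def log_plain(c1, r0, beta):
    """L = ln(R0**beta / C1)."""
    return math.log(r0 ** beta / c1)


def log_shifted(c1, r0, beta):
    """Shifted logarithm ln(R0**beta / C1 - 3), defined for C1 <= R0**beta / 4."""
    return math.log(r0 ** beta / c1 - 3.0)


if __name__ == "__main__":
    r0, beta = 0.6, 1.0
    c1_max = r0 ** beta / 4.0
    for c1 in (c1_max, 0.1 * c1_max, 1e-3 * c1_max, 1e-6 * c1_max):
        print(c1, log_plain(c1, r0, beta), log_shifted(c1, r0, beta))
```

At the endpoint the shifted logarithm is zero, so the resummation is switched off there, while for small C1 the two logarithms differ only by a term of relative order C1 / R0^β.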
B Breakdown of Perturbative Calculation 

In this appendix, we provide some simple quantitative arguments for the breakdown of the
perturbative calculation from Sec. 3.2 for values of β less than about 0.2. There are two effects
that we will consider: the QCD Landau pole and the breakdown of the independent emission
approximation. Of course, there may be other effects that become important at small values
of β, but these nevertheless suggest that our perturbative calculation of the quark versus
gluon discrimination power ceases to make sense at very small values of β.

First, the QCD Landau pole. At small β, the smallest scale Q0 that the running coupling
is sensitive to is

\[
Q_0 = p_T R_0\, e^{-L/\beta} \,. \tag{B.1}
\]

The perturbative calculation can be trusted when Q0 ≫ Λ_NP, where Λ_NP is a scale at which
αs becomes non-perturbative. We can estimate the value of β at which the non-perturbative
effects become important as follows. The logarithm of the observable C1 can be roughly
estimated from the LL, fixed-coupling expression for the cumulative distribution Σ for quarks,
where

\[
\Sigma \simeq e^{-\frac{\alpha_s}{\pi} \frac{C_F}{\beta} L^2} \,. \tag{B.2}
\]

Then, L in terms of Σ is

\[
L = \left( \frac{\pi \beta}{\alpha_s C_F} \ln \frac{1}{\Sigma} \right)^{1/2} \,. \tag{B.3}
\]

Demanding that Q0 > Λ_NP and using the expression for L from the above equation we find
that

\[
\beta > \frac{\pi}{\alpha_s C_F}\, \frac{\ln (1/\Sigma)}{\ln^2 \frac{p_T R_0}{\Lambda_{\rm NP}}} \,. \tag{B.4}
\]

Because we have used a fixed-coupling approximation, it is not immediately clear at what
scale αs should most appropriately be evaluated. Taking it at the geometric mean of pT R0
and Λ_NP ≃ 0.5 GeV gives αs ≃ 0.17. Plugging this into Eq. (B.4), for a quark efficiency of
50% and a jet selection as in Sec. 3.2, yields

\[
\beta \gtrsim \frac{\pi \ln 2}{0.17\, C_F\, \ln^2 \frac{400 \times 0.6}{0.5}} \simeq 0.25 \,, \tag{B.5}
\]

suggesting that non-perturbative effects become critical for β < 0.2-0.3. One can also perform
such an analysis numerically using the full NLL expressions for Σq, including all running-
coupling effects, and one reaches a similar conclusion.
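
The arithmetic behind the Landau-pole estimate can be reproduced in a few lines. The sketch assumes the bound β > π ln(1/Σ) / (αs C_F ln^2(pT R0 / Λ_NP)) that follows from requiring pT R0 e^(−L/β) > Λ_NP with the LL fixed-coupling form of Σ, and uses the inputs quoted in the text (αs ≃ 0.17, C_F = 4/3, Σ = 0.5, pT = 400 GeV, R0 = 0.6, Λ_NP = 0.5 GeV). The function name is hypothetical.

```python
import math


def beta_landau_bound(alpha_s, c_f, sigma, pt, r0, lambda_np):
    """Smallest beta for which the lowest scale probed stays above lambda_np."""
    log_scale = math.log(pt * r0 / lambda_np)
    return math.pi * math.log(1.0 / sigma) / (alpha_s * c_f * log_scale ** 2)


if __name__ == "__main__":
    bound = beta_landau_bound(alpha_s=0.17, c_f=4.0 / 3.0, sigma=0.5,
                              pt=400.0, r0=0.6, lambda_np=0.5)
    print(round(bound, 2))
```

Because the bound scales like 1 / ln^2(pT R0 / Λ_NP), it loosens only slowly as the jet pT increases.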

Second, the NLL calculation assumed that emissions could be treated as independent,
but multiple emissions cannot be regarded as independent when each emission can take an
O(1) fraction of the energy of the jet. That is, if the logarithm of C1 is not large then our
analysis (appropriate for soft-collinear emissions) breaks down. Assuming as above that the
cumulative distribution can be written as in Eq. (B.2), the minimal value of β is

\[
\beta_{\min} = \frac{\alpha_s C_F}{\pi \ln (1/\Sigma)}\, L_{\min}^2 \,, \tag{B.6}
\]

where L_min is the minimal value for the logarithm at which we trust the soft-collinear
analysis. Assuming that the soft-collinear analysis fails when L_min ≃ 2, with the same choice of
parameter values as above, including αs ≃ 0.17, β_min is

\[
\beta_{\min} \simeq \frac{0.17\, C_F}{\pi \ln 2}\, 2^2 \simeq 0.41 \,. \tag{B.7}
\]

The precise value of L_min at which the soft-collinear analysis is deemed to break down will
change this value. Nevertheless, multiple hard, collinear emissions become important and
result in a breakdown of the analysis when β is too small. To include the leading effect of
energy conservation among emissions, one must match the NLL resummation to NLO (O(αs^2))
splitting functions.
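
The sensitivity of this estimate to the assumed breakdown point can be seen by evaluating it for a few values of the minimal logarithm. The sketch assumes β_min = αs C_F L_min^2 / (π ln(1/Σ)), obtained by inverting the LL fixed-coupling form Σ ≃ exp(−(αs/π)(C_F/β) L^2), with αs ≃ 0.17, C_F = 4/3 and Σ = 0.5 as in the text. The function name and the scanned L_min values other than 2 are illustrative.

```python
import math


def beta_min(alpha_s, c_f, sigma, l_min):
    """Value of beta below which the logarithm at efficiency sigma drops under l_min."""
    return alpha_s * c_f * l_min ** 2 / (math.pi * math.log(1.0 / sigma))


if __name__ == "__main__":
    for l_min in (1.5, 2.0, 2.5):
        print(l_min, round(beta_min(0.17, 4.0 / 3.0, 0.5, l_min), 2))
```

The quadratic dependence on L_min means that modest changes in where the soft-collinear approximation is deemed to fail shift β_min by a factor of order one.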

It should be noted that the fact that the non-perturbative analysis and the multiple emissions
analysis give the same ballpark value of β_min is a coincidence due to the choice of parameters
that was made.
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